Mirabito - Ata Gunaydin; Elif Yildirim; Ozguven Salih; Samet Genc
Latest revision as of 04:05, 18 November 2025
Week 1 09/09/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, we held our initial project meeting and outlined the scope of the two interconnected projects: the AI Loyalty Engine and the Data Infrastructure Modeling. We reviewed technical requirements, development tools, and environment setup instructions, including Node.js, Angular, and C#. Additionally, we discussed laptop preferences and extension configurations to ensure a smooth development workflow across platforms.
Accomplishments:
- Defined project responsibilities: AI Loyalty Engine vs. Data Infrastructure Modeling.
- Reviewed system architecture and collaboration needs between both groups.
- Shared setup instructions for Node.js, Angular CLI, and C#.
- Established development environment recommendations for Windows and Mac (Visual Studio / VS Code).
To-Do:
- Install required tools (Node.js, Angular CLI, VS/VS Code with extensions).
- Coordinate regular check-ins between both project teams.
Week 2 & Week 3 09/16/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, we focused on reviewing the provided datasets and understanding their features. We examined both the Customer Basket and Inventory data to explore potential correlations and identify missing elements necessary for modeling. Additionally, we analyzed the AI engine code from the repositories to gain insights into the existing implementation. Finally, we held discussions with Kaan Balta from last year’s team, who provided observations and context regarding the codebase and its structure.
Accomplishments:
- Reviewed Customer Basket dataset and Inventory data, working to identify correlations between them.
- Explored dataset features and clarified column definitions, missing values, and potential resolutions.
- Investigated the AI Loyalty Engine source code and associated programs in the repositories.
To-Do:
- Draft scope documents outlining tasks, timelines, milestones, constraints, and deliverables.
- Propose solutions for gaps identified in the Inventory dataset.
- Continue deep-diving into feature relationships between datasets to refine modeling strategy.
- Begin aligning Data Infrastructure outputs with AI Loyalty Engine requirements for integration.
Week 4 10/06/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, both teams made significant progress in shaping the foundation for the upcoming development phases. The Data Infrastructure team focused on creating a sustainable and refined mock dataset to support the AI Loyalty Engine. All columns and their importance were discussed in detail, and several new features (e.g., `expiration_date`, `seasons`, `day_time`) were engineered. Features with unclear functionality were noted for expert consultation. Meanwhile, the AI Loyalty Engine team drafted improvement options for the model, researched multiple algorithmic approaches, and mapped key integration points within the existing system. Additionally, both teams collaborated to prepare a comprehensive Project Scope Document, clarifying requirements, milestones, and objectives for the semester.
Accomplishments:
Data Infrastructure Team:
- Initiated the creation of a sustainable mock dataset for AI engine development.
- Discussed and evaluated all dataset columns based on their relevance to the model.
- Designed new features (`expiration_date`, `seasons`, `day_time`) to enhance predictive capacity.
- Identified uncertain features for expert validation.
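As an illustration of the `seasons` and `day_time` features listed above, here is a minimal Python sketch. The `timestamp` column name, the Northern Hemisphere season mapping, and the hour cut-offs are assumptions made for the example, not the team's actual definitions.

```python
from datetime import datetime


def season_of(ts: datetime) -> str:
    """Map a timestamp to a meteorological season (Northern Hemisphere, assumed)."""
    month = ts.month
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def day_time_of(ts: datetime) -> str:
    """Bucket the hour of day; the cut-offs are illustrative assumptions."""
    hour = ts.hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def add_time_features(row: dict) -> dict:
    """Return a copy of a transaction row with `seasons` and `day_time` added."""
    ts = datetime.fromisoformat(row["timestamp"])  # hypothetical column name
    return {**row, "seasons": season_of(ts), "day_time": day_time_of(ts)}
```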
AI Loyalty Engine Team:
- Drafted Improvement Options v0.1, researched potential models, and mapped code/extension points.
- Explored a range of candidate models: ALS, BPR-MF, LightGCN, SAR, Two-Tower, SASRec/BERT4Rec, LightGBM/XGBoost, DeepFM/DLRM, LambdaMART, ILP/Knapsack, and Uplift/CATE approaches.
- Created key artifacts including: Improvements Document, Model Survey, and Code Map.
- Defined project scope: AI engine will propose weekly campaign offers for Marketing (not real-time per user).
Collaboration:
- Conducted a joint session to document requirements and structure a unified project plan.
- Developed a detailed Project Scope Document outlining tasks, deliverables, and shared milestones.
To-Do:
Data Infrastructure Team:
- Finalize the feature schema (expiry, inventory, margin, supplier funds).
- Validate mock dataset quality and integrate with AI engine inputs.
AI Loyalty Engine Team:
- Prototype Two-Tower retrieval model and train LightGBM baseline.
- Draft portfolio optimizer and design initial geo A/B testing plan.
General:
- Address open risks, ensure reliability of inventory/expiry data, confirm store clustering and segmentations, and align objectives on profit uplift and waste reduction.
Week 5 10/13/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, the Data Infrastructure team validated the refined transaction data and finalized the schema definitions along with the ER diagram, while the AI Loyalty Engine team completed the hybrid model integration and dashboard setup.
Accomplishments:
Data Infrastructure Team:
- Implemented a data checklist script for structure, numeric, and key integrity checks.
- Finalized the POS–Inventory One-Page Schema.
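A checklist script covering structure, numeric, and key integrity checks might look roughly like the following sketch. The function name, the column names used in the usage note, and the message format are hypothetical; the actual script is not shown in this log.

```python
def check_rows(rows, required, numeric, key):
    """Return a list of human-readable problems found in `rows` (a list of dicts)."""
    problems = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Structure check: every required column is present and non-empty.
        missing = [c for c in required if c not in row or row[c] in ("", None)]
        if missing:
            problems.append(f"row {i}: missing {missing}")
        # Numeric check: the value must parse as a float.
        for c in numeric:
            try:
                float(row.get(c))
            except (TypeError, ValueError):
                problems.append(f"row {i}: {c} is not numeric")
        # Key integrity check: the key column must be unique across rows.
        k = row.get(key)
        if k in seen_keys:
            problems.append(f"row {i}: duplicate key {k!r}")
        seen_keys.add(k)
    return problems
```

For example, `check_rows(rows, ["txn_id", "qty", "price"], ["qty", "price"], "txn_id")` returns an empty list when every row passes all three checks.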
AI Loyalty Engine Team:
- Built a hybrid recommendation system combining LightGBM, SAR, and rule-based scoring.
- Integrated business logic: margin and discount limits, stock and expiry tracking.
- Delivered the API, real-time dashboard, and CSV export tools.
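One way to picture how three scores and the business constraints could fit together is the sketch below. It assumes each model's score is already scaled to [0, 1]; the blend weights, the 30% discount cap, and the minimum post-discount margin are placeholder values, not the tuned values used in the project.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    product: str
    lgbm_score: float  # assumed scaled to [0, 1]
    sar_score: float   # assumed scaled to [0, 1]
    rule_score: float  # assumed scaled to [0, 1]
    margin: float      # gross margin as a fraction of price
    stock: int         # units on hand


def hybrid_score(c, weights=(0.5, 0.3, 0.2)):
    """Weighted blend of the three scores; the weights are placeholders."""
    w_lgbm, w_sar, w_rule = weights
    return w_lgbm * c.lgbm_score + w_sar * c.sar_score + w_rule * c.rule_score


def propose_offers(candidates, discount, max_discount=0.30, min_margin_after=0.05):
    """Rank candidates, dropping any that violate the stock or margin constraints."""
    discount = min(discount, max_discount)  # enforce the discount limit
    eligible = [
        c for c in candidates
        if c.stock > 0 and c.margin - discount >= min_margin_after
    ]
    ranked = sorted(eligible, key=hybrid_score, reverse=True)
    return [(c.product, round(hybrid_score(c), 3), discount) for c in ranked]
```

Filtering before ranking keeps the constraints hard: an out-of-stock or low-margin product cannot be offered no matter how high it scores.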
To-Do:
- Integrate continuous data validation into ETL flow.
- Create a separate table for detailed inventory data to support stock-level tracking.
- Tune model weights and perform end-to-end integration testing.
Week 6 10/20/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week marked significant progress in data refinement and feature integration across both teams. The Data Infrastructure group focused on extensive data cleaning to improve efficiency and model readiness, while the AI Loyalty Engine team implemented dynamic promotional logic and refined customer segmentation. The teams also collaborated to align datasets and ensure seamless system integration, paving the way for advanced analytical modeling in the coming weeks.
Accomplishments:
Data Infrastructure Team:
- Performed large-scale data cleaning by removing deposit/fees, fuel purchases, age verification, non-merchandise items, and car wash entries.
- Successfully reduced dataset size by more than half while maintaining data consistency and analytical depth.
- Applied previously tested feature engineering and time-series transformations to the real dataset.
- Implemented association rule mining using the Apriori algorithm to identify frequently co-purchased product combinations.
- Conducted exploratory analysis to examine purchasing traffic before, during, and after holidays.
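For readers unfamiliar with Apriori, the following is a small pure-Python sketch of the level-wise search and rule extraction. It is meant only to show the idea on toy baskets; the team's implementation, library choice, and thresholds are not documented here, and the item names in the usage note are invented.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    current = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while current:
        level = {}
        for s in current:
            sup = support(s)
            if sup >= min_support:
                level[s] = sup
        frequent.update(level)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                conf = sup / frequent[ante]  # confidence = sup(A and B) / sup(A)
                if conf >= min_confidence:
                    out.append((ante, itemset - ante, conf))
    return out
```

With baskets such as `[{"coffee", "donut"}, {"coffee", "water"}]`, raising `min_support` shrinks the set of frequent itemsets, while `min_confidence` controls which of them become rules.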
AI Loyalty Engine Team:
- Integrated the system with the newly cleaned and standardized dataset from the Data Infrastructure team.
- Handled missing values, unified data formats, and validated readiness for modeling.
- Implemented holiday- and season-based promotional logic within the system.
- Designed adaptive mechanisms for dynamically adjusting promotional campaigns based on time-sensitive factors such as holidays and seasonal demand.
- Categorized customers based on engagement activity into segments including Active, At-Risk, and Churn.
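The segment labels and the holiday-based adjustment could be expressed along these lines. This is a sketch under stated assumptions: the 30/90-day inactivity thresholds, the holiday list, the three-day window, and the five-point boost are all illustrative values, not the rules actually configured in the system.

```python
from datetime import date

# Illustrative holiday calendar (assumption, not the project's list).
HOLIDAYS = {
    date(2025, 11, 27): "Thanksgiving",
    date(2025, 12, 25): "Christmas",
}


def engagement_segment(last_purchase, today, at_risk_after=30, churn_after=90):
    """Label a customer by days since last purchase; thresholds are assumptions."""
    days_inactive = (today - last_purchase).days
    if days_inactive >= churn_after:
        return "Churn"
    if days_inactive >= at_risk_after:
        return "At-Risk"
    return "Active"


def promo_discount(base, today, window_days=3, holiday_boost=0.05):
    """Raise the base discount when `today` falls within a window around a holiday."""
    near_holiday = any(abs((today - h).days) <= window_days for h in HOLIDAYS)
    return round(base + holiday_boost, 2) if near_holiday else base
```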
To-Do:
- Continue optimizing data quality and explore further feature correlations for deeper insights.
- Fine-tune association rule thresholds and integrate top product combinations into campaign design.
Week 7 10/27/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary:
Accomplishments:
AI Loyalty Engine Team:
- Implemented and integrated new machine learning algorithms to enhance customer segmentation within Promos.
- The added models were trained on the existing datasets to improve segmentation accuracy and cluster interpretability.
- Within Promos, product names are now properly mapped and displayed, ensuring more consistent data representation and facilitating model evaluation.
- Preliminary tests indicate improved segmentation coherence and better alignment between predicted clusters and actual customer behavior patterns.
Data Infrastructure Team:
- Began running the developed data preprocessing and basket analysis programs on real and cleaned datasets.
- Gained new, accurate insights into product relationships within transactions.
- Identified and documented several areas in need of data correction and refinement for better modeling performance.
To-Do:
- Continue improving data pipelines and validate basket analysis findings.
- Provide updated datasets to the AI team for model retraining.
- Conduct detailed evaluation of the new segmentation models using validation metrics (e.g., silhouette score, cohesion).
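As a reference for the planned evaluation, the silhouette score can be computed from first principles as in the sketch below. It uses Euclidean distance and assumes at least two clusters; in practice a library routine would likely be used instead. For each point, `a` is the mean distance to its own cluster (cohesion) and `b` is the mean distance to the nearest other cluster (separation).

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires >= 2 clusters)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Values near 1 indicate compact, well-separated clusters; values near 0 or below suggest overlapping or misassigned segments.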
Week 8 10/28/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, both teams advanced their respective components of the project, transitioning to real data analysis and refining documentation for customer analytics. The AI Loyalty Engine team focused on real-time behavior analysis and dynamic metric visualization, while the Data Infrastructure team ensured data integrity and expanded documentation to align with advanced analytical capabilities.
Accomplishments:
AI Loyalty Engine Team:
- Transitioned from mock data to real data analysis for improved reliability and insight.
- Developed a dynamic ReactJS dashboard for automatic metric updates and behavioral comparisons.
- Analyzed Active, At-Risk, and Churn customer groups to evaluate model adaptability and promotional impact.
- Implemented a dynamic discount mechanism and explored additional churn indicators beyond AOV to enhance customer retention understanding.
Data Infrastructure Team:
- Conducted in-depth sales analysis and identified irrelevant data connections that caused inaccurate correlations.
- Updated `README_transactions.md` to include the latest analysis report.
- Expanded Business Applications with Customer Retention & Engagement and Marketing & Campaigns sections.
- Updated Technical Achievements to include customer analytics metrics, ensuring stronger alignment between data insights and project documentation.
To-Do:
- Validate cleaned datasets and confirm alignment between sales and behavioral data.
- Integrate the refined data pipeline with the AI team’s dashboard for unified analysis.
- Finalize documentation for the next milestone presentation and cross-team demo.
Week 9 11/04/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, the AI Loyalty Engine team focused on improving the accuracy of business metrics in the analytics dashboard by fixing ROI and dataset comparison logic. The Data Infrastructure team built a new, consolidated analysis notebook (data_cleaning / comprehensive customer & basket analysis) that loads the February 2025 transaction data, runs all preprocessing, and performs end-to-end customer, discount, and basket analytics in one place. This will be used as the main source for customer-level insights and for feeding the AI team.
Accomplishments:
AI Loyalty Engine Team:
- Corrected the ROI calculation so campaign performance reflects real discount cost vs. revenue.
- Fixed the comparison logic between different datasets so dashboard numbers stay consistent when inputs change.
- Ensured the dashboard updates dynamically when new/cleaned data is provided by the data team.
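The log does not state the corrected ROI formula. One common definition that weighs discount cost against revenue is sketched below; it assumes revenue is measured at list price, that a no-promotion baseline is available, and that `discount_cost` is the total value of discounts given. All three parameter names are hypothetical.

```python
def campaign_roi(promo_revenue, baseline_revenue, discount_cost):
    """ROI = (incremental revenue - discount cost) / discount cost.

    promo_revenue and baseline_revenue are gross (list-price) revenue with and
    without the campaign; this definition is an assumption for illustration.
    """
    if discount_cost <= 0:
        raise ValueError("discount_cost must be positive")
    incremental = promo_revenue - baseline_revenue
    return (incremental - discount_cost) / discount_cost
```

Under this definition a campaign that lifts revenue from 1000 to 1500 at a discount cost of 200 returns 1.5, meaning each dollar of discount produced 1.50 dollars of net gain.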
Data Infrastructure Team:
- Created a single consolidated notebook that:
  - Loads and combines all transaction CSVs.
  - Cleans numeric/date fields and standardizes loyalty/customer IDs.
  - Adds time-based features (week, year–month, day/time-of-day).
- Performed customer-level analysis (unique customers, activity, AOV, inactivity, weekly/monthly trends).
- Built a full RFM-based customer segmentation (Champions, Loyal, At Risk, Lost, etc.) and tracked weekly segment transitions.
- Analyzed discount and promotion effectiveness (which discount levels yield the most revenue per $1 of discount, and which segments respond most strongly).
- Generated visualizations and an executive summary inside the notebook so other team members can review results quickly.
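The RFM segmentation described above can be sketched as follows. Recency, frequency, and monetary value are computed per customer and then mapped to segment names with simple rules. The thresholds and the catch-all "Other" label are assumptions for illustration; the notebook's actual scoring rules are not recorded in this log.

```python
from datetime import date


def rfm_table(transactions, as_of):
    """Build per-customer RFM values.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    """
    raw = {}
    for customer, day, amount in transactions:
        r = raw.setdefault(customer, {"last": day, "frequency": 0, "monetary": 0.0})
        r["last"] = max(r["last"], day)
        r["frequency"] += 1
        r["monetary"] += amount
    return {
        c: {
            "recency": (as_of - r["last"]).days,  # days since last purchase
            "frequency": r["frequency"],          # number of purchases
            "monetary": r["monetary"],            # total spend
        }
        for c, r in raw.items()
    }


def segment(rfm, recent_days=7, lapsed_days=21, loyal_visits=3, champion_spend=100.0):
    """Map RFM values to a segment name; all thresholds are placeholders."""
    if rfm["recency"] > lapsed_days:
        return "Lost"
    if rfm["recency"] > recent_days:
        return "At Risk"
    if rfm["frequency"] >= loyal_visits:
        return "Champions" if rfm["monetary"] >= champion_spend else "Loyal"
    return "Other"  # placeholder for segments not named in this log
```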
To-Do:
- Validate the consolidated notebook with newer months (not just February) to make it a reusable pipeline.
- Pass the cleaned/segmented output to the AI team so the dashboard can show segment-level promo performance.
- Integrate top association rules into campaign design (bundle offers / cross-sell suggestions).
Week 10 11/11/2025
Attendance: Samet GENC, Elif YILDIRIM, Ata GUNAYDIN, Salih OZGUVEN
Summary: This week, both teams continued building upon the February and March datasets to expand customer segmentation and improve analytical accuracy. The Data Infrastructure Team applied the comprehensive RFM segmentation process to March 2025 transaction data, generating detailed segment distributions and weekly transition trends. This work provides a foundation for tracking customer lifecycle movements and evaluating marketing effectiveness. Meanwhile, the AI Loyalty Engine Team collaborated with the data team to integrate the expanded customer segments into the dashboard. Segment transition data and behavioral trends were visualized, allowing dynamic tracking of customer movement between segments.
Accomplishments:
Data Infrastructure Team:
- Applied full RFM segmentation pipeline to March 2025 data.
- Generated detailed weekly and monthly segment transition analyses.
- Updated documentation explaining segment logic and business implications.
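A segment transition analysis boils down to comparing two snapshots of customer-to-segment assignments. The sketch below counts moves between segments and converts them to row-normalized rates; the function names and the dict-based snapshot format are assumptions made for the example.

```python
from collections import Counter


def transition_counts(before, after):
    """Count (from_segment, to_segment) moves for customers in both snapshots."""
    shared = before.keys() & after.keys()
    return Counter((before[c], after[c]) for c in shared)


def transition_rates(counts):
    """Convert counts to the share of each source segment that moved to each target."""
    totals = Counter()
    for (src, _), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}
```

Customers who appear in only one snapshot (new or missing) are ignored here; a fuller analysis might track them as separate "New" and "Gone" states.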
AI Loyalty Engine Team:
- Integrated new customer segments and transition metrics into the dashboard.
- Enhanced analysis view to visualize customer movement between segments.
- Synced dashboard data with the latest outputs from the data team.
To-Do:
- Validate segment consistency across February and March datasets.
- Finalize integration for real-time segment updates in the dashboard.
- Begin preparing a summary visualization for the mid-term report.