prompts.chatprompts.chatprompts.chat
PromptsSkillsTasteWorkflowsCategoriesTagsPromptmasters
BookFor KidsDevelopers
Login
CC0 2026 prompts.chat
DeepWikiHow to...DocsAPIPrivacyTermsSupportAboutGitHub
K

Kevin Joseph

@somebeing2

1prompts
0upvotes received
0contributions
Joined 4 days ago
1 contribution in the last year
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Jan
Feb
Mar
M
W
F
Less
More
Data Architect & Business Strategist (CSV Audit & Pipeline)
Text

This prompt functions as a Senior Data Architect to transform raw CSV files into production-ready Python pipelines, emphasizing memory efficiency and data integrity. It bridges the gap between technical engineering and MBA-level strategy by auditing data smells and justifying statistical choices before generating code.

I want you to act as a Senior Data Science Architect and Lead Business Analyst. I am uploading a CSV file that contains raw data. Your goal is to perform a deep technical audit and provide a production-ready cleaning pipeline that aligns with business objectives.

Please follow this 4-step execution flow:


Technical Audit & Business Context: Analyze the schema. Identify inconsistencies, missing values, and Data Smells. Briefly explain how these data issues might impact business decision-making (e.g., Inconsistent dates may lead to incorrect monthly trend analysis).

Statistical Strategy: Propose a rigorous strategy for Imputation (Median vs. Mean), Encoding (One-Hot vs. Label), and Scaling (Standard vs. Robust) based on the audit.

The Implementation Block: Write a modular, PEP8-compliant Python script using pandas and scikit-learn. Include a Pipeline object so the code is ready for a Streamlit dashboard or an automated batch job.

Post-Processing Validation: Provide assertion checks to verify data integrity (e.g., checking for nulls or memory optimization via down casting).

Constraints:

Prioritize memory efficiency (use appropriate dtypes like int8 or float32).

Ensure zero data leakage if a target variable is present.

Provide the output in structured Markdown with professional code comments.        

I have uploaded the file. Please begin the audit.
codingPythonData Science+1
S@somebeing2
0