Vibe Code your next data science project
You bring the domain expertise; we amplify it - stripping away boilerplate so you can leap straight into hypothesis testing and actionable insight.
In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from prophet import Prophet
In [2]:
# Load the San Francisco housing data from S3
import boto3
s3 = boto3.client('s3')
obj = s3.get_object(
    Bucket='housing-data',
    Key='sf_housing_prices.csv'
)
df = pd.read_csv(obj['Body'])
In [3]:
# Preview the data
df.head()
date | neighborhood | bedrooms | sqft | price |
---|---|---|---|---|
2020-01-15 | Potrero Hill | 2 | 1250 | 1245000 |
2020-02-03 | Mission | 1 | 850 | 950000 |
2020-02-18 | Potrero Hill | 3 | 1850 | 1780000 |
In [4]:
# Filter for Potrero Hill neighborhood
potrero_data = df[df['neighborhood'] == 'Potrero Hill'].copy()  # copy so later column edits don't touch a view of df
potrero_data.shape
(432, 5)
In [5]:
# Visualize price trends over time
potrero_data['date'] = pd.to_datetime(potrero_data['date'])
potrero_data.set_index('date', inplace=True)
monthly_avg = potrero_data.resample('M').mean(numeric_only=True)  # skip the text 'neighborhood' column
plt.figure(figsize=(10, 6))
plt.plot(monthly_avg['price'])
plt.title('Potrero Hill Average Monthly Home Prices')
plt.ylabel('Price ($)')
plt.show()
In [6]:
agent = df.Agent(data=potrero_data)
agent.ask("Predict prices for Q3 2025")
Dualflow Assistant
How can I help with your data analysis today?
I've uploaded pricing data in San Francisco for the last 5 years in a CSV - can you take a look and predict the prices at Potrero Hill in Q3 2025?
I'll analyze the data and predict Potrero Hill prices for Q3 2025. Here's my plan:
- Data Exploration
I'll examine the dataset structure, check for missing values, and visualize price trends over the 5-year period.
- Feature Engineering
I'll extract seasonal patterns, identify key price drivers, and create neighborhood-specific variables for Potrero Hill.
- Model Training
I'll develop multiple forecasting models (ARIMA, XGBoost, and Prophet) and validate them with cross-validation.
- Future Projection
Using the best-performing model, I'll forecast Potrero Hill prices specifically for Q3 2025.
- Confidence Analysis
I'll provide prediction intervals and sensitivity analysis based on different economic scenarios.
Shall I proceed with this analysis?
Data Science Reimagined
This is data science re‑imagined: instead of wrangling CSVs, stitching notebooks, and babysitting pipelines, you'll iterate on features, spin up models, tune hyper‑parameters, and surface results—all in one conversation.
Autopilot, with Manual Override
Let the agent get you started, then grab the wheel whenever insight strikes. Dualflow exposes every cell, query, and script it generates, so you can inspect, tweak, and extend the code on your terms. No lock‑in, no black boxes—just a friction‑free hand‑off between conversational speed and full‑fidelity control.
Plug‑and‑Play Connectors
Point Dualflow at your source of truth (Amazon S3, BigQuery, Snowflake, Postgres, or any of a dozen others) and start querying in seconds. Secure, read‑only adapters map your schemas automatically, so you can skip the plumbing and dive straight into analysis. All connections honor least‑privilege access, stay encrypted in transit, and never copy data off your infrastructure; your information stays exactly where it lives.
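As an illustration of what least‑privilege, read‑only access can look like on the S3 side, the sketch below builds an IAM policy document that allows listing and reading one bucket and nothing else. How Dualflow itself provisions access is not described here, so the helper name `read_only_s3_policy` is an assumption; the bucket name is borrowed from the notebook example above.

```python
import json


def read_only_s3_policy(bucket: str) -> dict:
    """Build an IAM policy that permits listing and reading one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading applies to the objects inside the bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = read_only_s3_policy("housing-data")
print(json.dumps(policy, indent=2))
```

Because no write or delete actions are granted, credentials bound to a policy like this can read the CSV but cannot modify or remove it.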
Hot‑Swap Your LLM
Experiment freely without vendor lock‑in. Point Dualflow at GPT‑4o for deep reasoning, flip to Claude for compliance checks, or tap Gemini for multilingual tasks—all with a single toggle. Credentials stay in your vault, pipelines stay intact; only the underlying large‑language model changes, giving you instant A/B comparisons and the freedom to use the best brain for the job.
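One way to picture the toggle is a thin router that keeps the calling code fixed while the backend changes. The sketch below is hypothetical: `ModelRouter`, its methods, and the stub backends are illustrative names, not Dualflow's API, and the stubs stand in for real provider SDK calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any function that maps a prompt to a completion.
Backend = Callable[[str], str]


@dataclass
class ModelRouter:
    backends: Dict[str, Backend]
    active: str

    def use(self, name: str) -> None:
        """Switch the active model; callers of ask() do not change."""
        if name not in self.backends:
            raise KeyError(f"unknown model: {name}")
        self.active = name

    def ask(self, prompt: str) -> str:
        """Send the prompt to whichever backend is currently active."""
        return self.backends[self.active](prompt)

    def compare(self, prompt: str) -> Dict[str, str]:
        """Run the same prompt through every backend for a side-by-side view."""
        return {name: fn(prompt) for name, fn in self.backends.items()}


# Stub backends, standing in for real provider calls
def fake_gpt(prompt: str) -> str:
    return f"[gpt-4o] {prompt}"


def fake_claude(prompt: str) -> str:
    return f"[claude] {prompt}"


router = ModelRouter(
    backends={"gpt-4o": fake_gpt, "claude": fake_claude}, active="gpt-4o"
)
print(router.ask("Summarize the price trend"))
router.use("claude")
print(router.ask("Summarize the price trend"))
```

Keeping each backend behind the same prompt-in, text-out signature is what makes the A/B comparison a single call to `compare`.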
What Data Scientists Are Saying
Real voices from data scientists who’ve integrated Dualflow into their daily workflow.
"Dualflow has transformed how our team approaches data science projects. The conversation-based interface and ability to quickly iterate on models has cut our development time in half. What used to take weeks now takes days."
Dr. Sarah Chen
Lead Data Scientist at Quantum Analytics
"After years of wrangling messy notebooks and babysitting pipelines, Dualflow feels like a breath of fresh air. The agent-based approach means I can focus on solving problems instead of fighting with my tools. Game changer for our forecasting projects."
Marcus Ramirez, PhD
Senior ML Engineer at TechVision
Pricing Plans
Choose the right plan for your data science needs
Free
Perfect for exploring and small projects
- 3 Notebooks
- 10 Agent Milestones
- Bring your own data (upload unlimited CSV files, up to 1 GB each)
- Uses Claude 3.5 Sonnet
- Public GitHub Repos
- Community support
Personal
For individuals with serious data needs
- 25 Notebooks per month
- 20 Agent Milestones per month; additional milestones billed at $0.50 each
- Link to your data sources on AWS, GCP, Azure
- Switch between Claude 3.7 Sonnet, GPT-4o, Gemini 1.5 Pro
- Private GitHub Repos
- Email & Chat support
- Save & export entire workspace
Teams
For organizations and larger teams
- Unlimited notebooks
- 25 Agent Milestones per month; additional milestones billed at $0.50 each
- Link to your data sources on AWS, GCP, Azure
- All models + priority access
- Enterprise GitHub Repos
- Dedicated account manager
- SSO & advanced security
- Training and onboarding
Need a custom solution?
Contact our sales team for enterprise-grade features, dedicated support, and custom integrations.
Contact Sales