Vibe Code your next data science project

You bring the domain expertise; we amplify it - stripping away boilerplate so you can leap straight into hypothesis testing and actionable insight.

In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from prophet import Prophet

In [2]:

# Load the San Francisco housing data from S3
import boto3

s3 = boto3.client('s3')
obj = s3.get_object(
    Bucket='housing-data',
    Key='sf_housing_prices.csv'
)
df = pd.read_csv(obj['Body'])

In [3]:

# Preview the data
df.head()

date        neighborhood  bedrooms  sqft  price
2020-01-15  Potrero Hill  2         1250  1245000
2020-02-03  Mission       1         850   950000
2020-02-18  Potrero Hill  3         1850  1780000

In [4]:

# Filter for Potrero Hill neighborhood
potrero_data = df[df['neighborhood'] == 'Potrero Hill']
potrero_data.shape

(432, 5)

In [5]:

# Visualize price trends over time
potrero_data = potrero_data.copy()  # work on a copy to avoid SettingWithCopyWarning
potrero_data['date'] = pd.to_datetime(potrero_data['date'])
potrero_data.set_index('date', inplace=True)
monthly_avg = potrero_data.resample('M').mean(numeric_only=True)  # skip the non-numeric neighborhood column

plt.figure(figsize=(10, 6))
plt.plot(monthly_avg['price'])
plt.title('Potrero Hill Average Monthly Home Prices')
plt.ylabel('Price ($)')
plt.show()

In [6]:

# Hand the filtered data to the Dualflow agent (module name assumed for this demo)
import dualflow

agent = dualflow.Agent(data=potrero_data)
agent.ask("Predict prices for Q3 2025")

Dualflow Assistant

AI

How can I help with your data analysis today?

User

I've uploaded pricing data in San Francisco for the last 5 years in a CSV - can you take a look and predict the prices at Potrero Hill in Q3 2025?

AI

I'll analyze the data and predict Potrero Hill prices for Q3 2025. Here's my plan:

  1. Data Exploration

    I'll examine the dataset structure, check for missing values, and visualize price trends over the 5-year period.

  2. Feature Engineering

    I'll extract seasonal patterns, identify key price drivers, and create neighborhood-specific variables for Potrero Hill.

  3. Model Training

    I'll develop multiple forecasting models (ARIMA, XGBoost, and Prophet) and validate them with cross-validation.

  4. Future Projection

    Using the best-performing model, I'll forecast Potrero Hill prices specifically for Q3 2025.

  5. Confidence Analysis

    I'll provide prediction intervals and sensitivity analysis based on different economic scenarios.

Shall I proceed with this analysis?
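The transcript stops at the plan, but steps 3 and 4 are easy to sketch by hand with the Prophet import from cell [1]. The snippet below is a minimal, illustrative version (not the agent's generated code), assuming the monthly_avg frame computed in cell [5] is still in scope:

# Illustrative only, not the agent's output: the Prophet step from the plan above,
# fitted on the monthly averages from cell [5].
history = monthly_avg['price'].dropna().reset_index()
history.columns = ['ds', 'y']          # Prophet's required column names

m = Prophet(yearly_seasonality=True)
m.fit(history)

# Extend far enough past the last observed month to cover Q3 2025
future = m.make_future_dataframe(periods=18, freq='M')
forecast = m.predict(future)

# Prophet's point forecast (yhat) and default 80% interval for Jul-Sep 2025
q3_2025 = forecast[(forecast['ds'] >= '2025-07-01') & (forecast['ds'] <= '2025-09-30')]
print(q3_2025[['ds', 'yhat', 'yhat_lower', 'yhat_upper']])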

Data Science Reimagined

This is data science re‑imagined: instead of wrangling CSVs, stitching notebooks, and babysitting pipelines, you'll iterate on features, spin up models, tune hyper‑parameters, and surface results—all in one conversation.

Autopilot, with Manual Override

Let the agent get you started, then grab the wheel whenever insight strikes. Dualflow exposes every cell, query, and script it generates, so you can inspect, tweak, and extend the code on your terms. No lock‑in, no black boxes—just a friction‑free hand‑off between conversational speed and full‑fidelity control.
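For a concrete, purely illustrative example of grabbing the wheel: the RandomForestRegressor and train_test_split imports from cell [1] are enough to fit a quick baseline on the bedrooms and sqft columns shown in the preview, right alongside whatever the agent produces.

# Purely illustrative: a hand-rolled RandomForest baseline next to the agent's forecast,
# using the imports from cell [1] and the columns shown in the df.head() preview.
features = potrero_data[['bedrooms', 'sqft']]
target = potrero_data['price']

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=42
)

rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)

# R^2 on the held-out 20%, a quick sanity check you control end to end
print(rf.score(X_test, y_test))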

Plug‑and‑Play Connectors

Point Dualflow at your source of truth — Amazon S3, BigQuery, Snowflake, Postgres, or any of a dozen others—and start querying in seconds. Secure, read‑only adapters map your schemas automatically, so you can skip the plumbing and dive straight into analysis. All connections honor least‑privilege access, stay encrypted in transit, and never copy data off your infrastructure—your information stays exactly where it lives.
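For comparison, this is the kind of hand-written plumbing a read-only connector stands in for; the connection string, role, and table name below are hypothetical placeholders, not part of the demo dataset.

# What the connector replaces: a hand-written, read-only Postgres pull.
# The connection string, role, and table name are hypothetical placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://readonly_user:secret@db.example.com:5432/housing"
)
df = pd.read_sql(
    "SELECT date, neighborhood, bedrooms, sqft, price FROM sf_housing_prices",
    engine,
)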

Hot‑Swap Your LLM

Experiment freely without vendor lock‑in. Point Dualflow at GPT‑4o for deep reasoning, flip to Claude for compliance checks, or tap Gemini for multilingual tasks—all with a single toggle. Credentials stay in your vault, pipelines stay intact; only the underlying large‑language model changes, giving you instant A/B comparisons and the freedom to use the best brain for the job.
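Conceptually, the hot-swap works because the pipeline is decoupled from the model identifier. The sketch below is purely hypothetical (none of these names are Dualflow's actual API); it only shows that swapping the model is a one-key change while the surrounding code stays put.

# Purely hypothetical sketch (not Dualflow's API): the pipeline stays fixed,
# only the model identifier changes, so the same prompt can be A/B-compared.
from typing import Callable, Dict

def call_gpt4o(prompt: str) -> str:
    return f"[gpt-4o] {prompt}"        # placeholder for the provider SDK call

def call_claude(prompt: str) -> str:
    return f"[claude] {prompt}"        # placeholder for the provider SDK call

def call_gemini(prompt: str) -> str:
    return f"[gemini] {prompt}"        # placeholder for the provider SDK call

MODELS: Dict[str, Callable[[str], str]] = {
    "gpt-4o": call_gpt4o,
    "claude": call_claude,
    "gemini": call_gemini,
}

def ask(prompt: str, model: str = "gpt-4o") -> str:
    # Swapping models is a one-key change; nothing else in the pipeline moves
    return MODELS[model](prompt)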

What Data Scientists Are Saying

Real voices from data scientists who’ve brought Dualflow into their daily workflow.

"Dualflow has transformed how our team approaches data science projects. The conversation-based interface and ability to quickly iterate on models has cut our development time in half. What used to take weeks now takes days."

Dr. Sarah Chen

Lead Data Scientist at Quantum Analytics

"After years of wrangling messy notebooks and babysitting pipelines, Dualflow feels like a breath of fresh air. The agent-based approach means I can focus on solving problems instead of fighting with my tools. Game changer for our forecasting projects."

Marcus Ramirez, PhD

Senior ML Engineer at TechVision


Pricing Plans

Choose the right plan for your data science needs

Free

Free

Perfect for exploring and small projects

  • 3 Notebooks
  • 10 Agent Milestones
  • Bring your own data (upload unlimited CSV files up to 1GB each)
  • Uses Claude 3.5 Sonnet
  • Public GitHub Repos
  • Community support
MOST POPULAR

Personal

$19 per month

For individuals with serious data needs

  • 25 Notebooks per month
  • 20 Agent Milestones per month; additional milestones billed at $0.50 each
  • Link to your data sources on AWS, GCP, Azure
  • Switch between Claude 3.7 Sonnet, GPT-4o, Gemini 1.5 Pro
  • Private GitHub Repos
  • Email & Chat support
  • Save & export entire workspace

Teams

$49 per month

For organizations and larger teams

  • Unlimited notebooks
  • 25 Agent Milestones per month; additional milestones billed at $0.50 each
  • Link to your data sources on AWS, GCP, Azure
  • All models + priority access
  • Enterprise GitHub Repos
  • Dedicated account manager
  • SSO & advanced security
  • Training and onboarding

Need a custom solution?

Contact our sales team for enterprise-grade features, dedicated support, and custom integrations.

Contact Sales