Vibe Code your next data science project

You bring the domain expertise; we amplify it - stripping away boilerplate so you can leap straight into hypothesis testing and actionable insight.

In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from prophet import Prophet

In [2]:

# Load the San Francisco housing data from S3
import boto3

s3 = boto3.client('s3')
obj = s3.get_object(
    Bucket='housing-data',
    Key='sf_housing_prices.csv'
)
df = pd.read_csv(obj['Body'])

In [3]:

# Preview the data
df.head()

date        neighborhood  bedrooms  sqft  price
2020-01-15  Potrero Hill  2         1250  1245000
2020-02-03  Mission       1         850   950000
2020-02-18  Potrero Hill  3         1850  1780000

In [4]:

# Filter for Potrero Hill neighborhood
potrero_data = df[df['neighborhood'] == 'Potrero Hill']
potrero_data.shape

(432, 5)

In [5]:

# Visualize price trends over time
potrero_data = potrero_data.copy()  # work on a copy to avoid SettingWithCopyWarning
potrero_data['date'] = pd.to_datetime(potrero_data['date'])
potrero_data.set_index('date', inplace=True)
monthly_avg = potrero_data.resample('M').mean(numeric_only=True)  # skip the non-numeric neighborhood column

plt.figure(figsize=(10, 6))
plt.plot(monthly_avg['price'])
plt.title('Potrero Hill Average Monthly Home Prices')
plt.ylabel('Price ($)')
plt.show()

In [6]:

# Hand the filtered data to the Dualflow agent (module name assumed for this demo)
import dualflow

agent = dualflow.Agent(data=potrero_data)
agent.ask("Predict prices for Q3 2025")

Dualflow Assistant

AI

How can I help with your data analysis today?

User

I've uploaded pricing data in San Francisco for the last 5 years in a CSV - can you take a look and predict the prices at Potrero Hill in Q3 2025?

AI

I'll analyze the data and predict Potrero Hill prices for Q3 2025. Here's my plan:

  1. Data Exploration

    I'll examine the dataset structure, check for missing values, and visualize price trends over the 5-year period.

  2. Feature Engineering

    I'll extract seasonal patterns, identify key price drivers, and create neighborhood-specific variables for Potrero Hill.

  3. Model Training

    I'll develop multiple forecasting models (ARIMA, XGBoost, and Prophet) and validate them with cross-validation.

  4. Future Projection

    Using the best-performing model, I'll forecast Potrero Hill prices specifically for Q3 2025.

  5. Confidence Analysis

    I'll provide prediction intervals and sensitivity analysis based on different economic scenarios.

Shall I proceed with this analysis?
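The transcript stops at the plan, but steps 3 and 4 are easy to sketch by hand with the Prophet import from cell [1]. The snippet below is a minimal, illustrative version (not the agent's generated code), assuming the monthly_avg frame computed in cell [5] is still in scope:

# Illustrative only, not the agent's output: the Prophet step from the plan above,
# fitted on the monthly averages from cell [5].
history = monthly_avg['price'].dropna().reset_index()
history.columns = ['ds', 'y']          # Prophet's required column names

m = Prophet(yearly_seasonality=True)
m.fit(history)

# Extend far enough past the last observed month to cover Q3 2025
future = m.make_future_dataframe(periods=18, freq='M')
forecast = m.predict(future)

# Prophet's point forecast (yhat) and default 80% interval for Jul-Sep 2025
q3_2025 = forecast[(forecast['ds'] >= '2025-07-01') & (forecast['ds'] <= '2025-09-30')]
print(q3_2025[['ds', 'yhat', 'yhat_lower', 'yhat_upper']])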

Data Science Reimagined

This is data science re‑imagined: instead of wrangling CSVs, stitching notebooks, and babysitting pipelines, you'll iterate on features, spin up models, tune hyper‑parameters, and surface results—all in one conversation.

Autopilot, with Manual Override

Let the agent get you started, then grab the wheel whenever insight strikes. Dualflow exposes every cell, query, and script it generates, so you can inspect, tweak, and extend the code on your terms. No lock‑in, no black boxes—just a friction‑free hand‑off between conversational speed and full‑fidelity control.
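For a concrete, purely illustrative example of grabbing the wheel: the RandomForestRegressor and train_test_split imports from cell [1] are enough to fit a quick baseline on the bedrooms and sqft columns shown in the preview, right alongside whatever the agent produces.

# Purely illustrative: a hand-rolled RandomForest baseline next to the agent's forecast,
# using the imports from cell [1] and the columns shown in the df.head() preview.
features = potrero_data[['bedrooms', 'sqft']]
target = potrero_data['price']

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=42
)

rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)

# R^2 on the held-out 20%, a quick sanity check you control end to end
print(rf.score(X_test, y_test))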

Plug‑and‑Play Connectors

Point Dualflow at your source of truth — Amazon S3, BigQuery, Snowflake, Postgres, or any of a dozen others—and start querying in seconds. Secure, read‑only adapters map your schemas automatically, so you can skip the plumbing and dive straight into analysis. All connections honor least‑privilege access, stay encrypted in transit, and never copy data off your infrastructure—your information stays exactly where it lives.
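For comparison, this is the kind of hand-written plumbing a read-only connector stands in for; the connection string, role, and table name below are hypothetical placeholders, not part of the demo dataset.

# What the connector replaces: a hand-written, read-only Postgres pull.
# The connection string, role, and table name are hypothetical placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://readonly_user:secret@db.example.com:5432/housing"
)
df = pd.read_sql(
    "SELECT date, neighborhood, bedrooms, sqft, price FROM sf_housing_prices",
    engine,
)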

Hot‑Swap Your LLM

Experiment freely without vendor lock‑in. Point Dualflow at GPT‑4o for deep reasoning, flip to Claude for compliance checks, or tap Gemini for multilingual tasks—all with a single toggle. Credentials stay in your vault, pipelines stay intact; only the underlying large‑language model changes, giving you instant A/B comparisons and the freedom to use the best brain for the job.
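Conceptually, the hot-swap works because the pipeline is decoupled from the model identifier. The sketch below is purely hypothetical (none of these names are Dualflow's actual API); it only shows that swapping the model is a one-key change while the surrounding code stays put.

# Purely hypothetical sketch (not Dualflow's API): the pipeline stays fixed,
# only the model identifier changes, so the same prompt can be A/B-compared.
from typing import Callable, Dict

def call_gpt4o(prompt: str) -> str:
    return f"[gpt-4o] {prompt}"        # placeholder for the provider SDK call

def call_claude(prompt: str) -> str:
    return f"[claude] {prompt}"        # placeholder for the provider SDK call

def call_gemini(prompt: str) -> str:
    return f"[gemini] {prompt}"        # placeholder for the provider SDK call

MODELS: Dict[str, Callable[[str], str]] = {
    "gpt-4o": call_gpt4o,
    "claude": call_claude,
    "gemini": call_gemini,
}

def ask(prompt: str, model: str = "gpt-4o") -> str:
    # Swapping models is a one-key change; nothing else in the pipeline moves
    return MODELS[model](prompt)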

What Data Scientists Are Saying

Real voices from data scientists who’ve brought Dualflow into their daily workflow.

"Dualflow has transformed how our team approaches data science projects. The conversation-based interface and ability to quickly iterate on models has cut our development time in half. What used to take weeks now takes days."

Dr. Sarah Chen

Lead Data Scientist at Quantum Analytics

"After years of wrangling messy notebooks and babysitting pipelines, Dualflow feels like a breath of fresh air. The agent-based approach means I can focus on solving problems instead of fighting with my tools. Game changer for our forecasting projects."

Marcus Ramirez, PhD

Senior ML Engineer at TechVision


Pricing Plans

Choose the right plan for your data science needs

Free

Free

Perfect for exploring and small projects

  • 3 Notebooks
  • 10 Agent Milestones
  • Bring your own data (upload unlimited CSV files up to 1GB each)
  • Uses Claude 3.5 Sonnet
  • Public GitHub Repos
  • Community support
MOST POPULAR

Personal

$19 per month

For individuals with serious data needs

  • 25 Notebooks per month
  • 20 Agent Milestones per month; additional milestones billed at $0.50 each
  • Link to your data sources on AWS, GCP, Azure
  • Switch between Claude 3.7 Sonnet, GPT-4o, Gemini 1.5 Pro
  • Private GitHub Repos
  • Email & Chat support
  • Save & export entire workspace

Teams

$49 per month

For organizations and larger teams

  • Unlimited notebooks
  • 25 Agent Milestones per month; additional milestones billed at $0.50 each
  • Link to your data sources on AWS, GCP, Azure
  • All models + priority access
  • Enterprise GitHub Repos
  • Dedicated account manager
  • SSO & advanced security
  • Training and onboarding

Need a custom solution?

Contact our sales team for enterprise-grade features, dedicated support, and custom integrations.

Contact Sales