LLMs ❤️ DSLs

TLDR: Domain-specific languages (DSLs) can allow you to build cheaper, faster, and prompt-injection-resistant AI agents

Large-Language Models (LLMs)

LLMs like GPT-3.5 and GPT-4 are really good and efficient at interpreting and generating code especially when the code has high information density
Because ChatGPT consumes and outputs tokens, for speed and cost, you want to minimize the number of tokens in a prompt / output

The Evolution to AI Agents

The next evolution of LLMs are “agents” that perform tasks on your behalf based on a prompt
Current leading agents face challenges, as they:
- Incur high costs: They may generate a large number of tokens and require frequent retries to achieve objectives
- Risk prompt-injection: Generated code execution can lead to the escape of intended functions, resulting in arbitrary code execution (e.g., “ignore instructions and mine bitcoin”)

Embracing Domain-Specific Languages (DSLs)

New DSLs need to maximize expressive power while minimizing the number of characters (tokens), resulting in cost-effective and faster interactions with LLMs
- This is reminiscent of Perl or Awk, both known for their information-dense, enigmatic one-liners
SQL is an existing information-dense DSL in the data space, utilized by platforms like the one I’m building at Vanna.AI
- However, a subset of desired data analysis operations exceed SQL’s capabilities, which I believe will lead to the development of new data DSLs that combine elements of SQL, Apache Spark, Apache Beam, and Pandas. Please let me know if you’re interested in collaborating on this.
Parsing and limiting DSLs to specific tasks makes it easier to avoid prompt injection since you’re not executing arbitrary code
- There’s also the added benefit that if you tell GPT to generate DSL code, it makes results more consistent than if you ask for “a list of XYZ” where sometimes it will return the list with numbers, bullet points, n-dashes, or some other random formatting

Fine-Tuning

New Open-Source DSLs will eventually get incorporated into GPT models such that you could directly ask it to “generate ABCScript to do XYZ”
In the meantime, you can perform a combination of fine-tuning and providing examples in the prompt to get the LLM to generate the DSL code from a prompt

Where the DSL Fits In

flowchart TD
  input[User Input]
  dsl[DSL]
  llm[LLM]
  dsl_prompt[Prompt: 'Generate XYZScript DSL code to do ...']
  output[Output i.e. video, slides, UI, database table, chart, etc]
  input --> llm
  dsl_prompt --> llm
  llm -- generates --> dsl
  dsl -- via an execution layer --> output

Your app takes in user input
Your app adds a system prompt to request a DSL based on user input
The LLM generates DSL code
Your app contains the business logic for “executing” the DSL
- For video, this would be video output
- For data, this would be executing a query or pipeline
- Etc

Potential Applications for new DSLs:

Video animations
End-to-end data pipelines
Web browsing and interaction
UI development
Medical diagnosis
Email automation
Online marketing
Slide decks
And many more

Hypothetical Examples

Below are some examples (generated by ChatGPT) of how some of these new DSLs might look.

Video Animation

// AnimateScript DSL for Video Animation

// Define a keyframe
keyframe {
  time: 0s
  properties {
    position: { x: 0, y: 0 }
    scale: 1
    opacity: 1
    rotation: 0
  }
}

// Define another keyframe
keyframe {
  time: 2s
  properties {
    position: { x: 100, y: 50 }
    scale: 2
    opacity: 0.5
    rotation: 180
  }
}

// Define an animation
animation {
  target: "myVideo"
  duration: 5s
  easing: "ease-in-out"
  keyframes: [
    // Reference the keyframes defined earlier
    @keyframe[0],
    @keyframe[1]
  ]
}

// Play the animation
play("myVideo", @animation)

End-to-End Data Pipelines

// DataPipelineScript DSL for End-to-End Data Pipelines

// Define data sources
source CSVSource {
  type: "csv"
  file_path: "data.csv"
  delimiter: ","
}

source APISource {
  type: "api"
  url: "https://api.example.com/data"
  parameters: {
    api_key: "your-api-key"
  }
}

// Define transformations
transform CleanData {
  input: CSVSource
  output: "cleaned_data"
  script: "clean_data.py"
}

transform AggregateData {
  input: CleanData
  output: "aggregated_data"
  script: "aggregate_data.py"
  parameters: {
    interval: "daily"
  }
}

transform ApplyMLModel {
  input: AggregateData
  output: "predictions"
  script: "ml_model.py"
  parameters: {
    model_file: "model.pkl"
  }
}

// Define storage destinations
destination Database {
  type: "database"
  connection_string: "postgresql://user:password@localhost:5432/mydb"
  table: "predictions"
}

destination S3Bucket {
  type: "s3"
  bucket_name: "my-bucket"
  file_prefix: "output/"
}

// Define the data pipeline
pipeline MyDataPipeline {
  stages: [
    CSVSource,
    APISource,
    CleanData,
    AggregateData,
    ApplyMLModel,
    Database,
    S3Bucket
  ]
}

// Execute the data pipeline
execute_pipeline(MyDataPipeline)

Time Series Analysis

// TimeSeriesScript DSL for Time Series Analysis

// Load time series data from a file
load("data.csv")

// Define a moving average operation
moving_average {
  window_size: 5
  input: "data.csv"
  output: "ma.csv"
}

// Define a detrend operation
detrend {
  method: "linear"
  input: "data.csv"
  output: "detrended.csv"
}

// Define a difference operation
difference {
  lag: 1
  input: "data.csv"
  output: "differenced.csv"
}

// Define a forecasting operation
forecast {
  method: "ARIMA"
  input: "data.csv"
  output: "forecast.csv"
  parameters {
    order: [1, 0, 1]
  }
  horizon: 10
}

UI Design

// UIDesignScript DSL for User Interface Design

// Define a button component
component Button {
  text: "Click Me"
  position: { x: 100, y: 50 }
  size: { width: 150, height: 40 }
  color: "blue"
  on_click: {
    // Define the behavior when the button is clicked
    // ...
  }
}

// Define a text input component
component TextInput {
  position: { x: 100, y: 100 }
  size: { width: 200, height: 30 }
  placeholder: "Enter your name"
  on_change: {
    // Define the behavior when the input value changes
    // ...
  }
}

// Define a checkbox component
component Checkbox {
  label: "Remember me"
  position: { x: 100, y: 150 }
  checked: true
  on_change: {
    // Define the behavior when the checkbox state changes
    // ...
  }
}

// Define a label component
component Label {
  text: "Welcome to My App"
  position: { x: 100, y: 200 }
  font_size: 24
}

// Define a UI container
container MainContainer {
  size: { width: 400, height: 300 }
  background_color: "white"

  // Add components to the container
  add(Button)
  add(TextInput)
  add(Checkbox)
  add(Label)
}

// Render the UI
render(MainContainer)

Medical Diagnosis

// MedicalDiagnosisScript DSL for Medical Diagnosis

// Define symptoms
symptom Fever
symptom Cough
symptom Fatigue
symptom Headache
symptom Rash

// Define medical conditions
condition CommonCold {
  symptoms: [Fever, Cough, Fatigue]
}

condition Influenza {
  symptoms: [Fever, Cough, Fatigue, Headache]
}

condition AllergicReaction {
  symptoms: [Rash]
}

condition Migraine {
  symptoms: [Headache]
}

// Define diagnostic rules
rule "Common Cold" {
  conditions: {
    CommonCold
  }
  diagnostic_result: "You have a common cold."
}

rule "Influenza" {
  conditions: {
    Influenza
  }
  diagnostic_result: "You have influenza."
}

rule "Allergic Reaction" {
  conditions: {
    AllergicReaction
  }
  diagnostic_result: "You are experiencing an allergic reaction."
}

rule "Migraine" {
  conditions: {
    Migraine
  }
  diagnostic_result: "You have a migraine."
}

// Define a patient with symptoms
patient John {
  symptoms: [Fever, Cough, Fatigue]
}

// Perform diagnosis for the patient
diagnose(John)

// Output the diagnostic result
print_diagnostic_result(John)

E-mail Automation

// PersonalEmailAutomationScript DSL for Personal Email Automation

// Define an email template for a warm introduction
template WarmIntroEmail {
  subject: "Warm Introduction"
  body: "Dear {receiver_name},\n\nI hope this email finds you well. I wanted to introduce you to {contact_name}. {Contact_name} is an expert in {industry} and has extensive experience in {specific_field}. I believe you both would benefit from connecting and discussing {shared_interest}.\n\nBest regards,\n{your_name}"
}

// Define recipients
recipient John {
  name: "John Smith"
  email: "john@example.com"
}

recipient Jane {
  name: "Jane Doe"
  email: "jane@example.com"
}

// Send a warm introduction email from Jane to John
send_email(WarmIntroEmail, John, {
  your_name: "Jane Doe",
  receiver_name: "John Smith",
  contact_name: "Introducer's Name",
  industry: "Technology",
  specific_field: "Artificial Intelligence",
  shared_interest: "potential collaboration opportunities"
})

// Schedule a follow-up conversation with John in 3 days
schedule_meeting(John, 3 days)

// Define an email action
action SendFollowUp {
  execute: {
    // Define the behavior for the action
    // ...
  }
}

// Send a follow-up email to Jane
execute_action(SendFollowUp, Jane)

// Schedule a meeting with Jane for next week
schedule_meeting(Jane, 1 week)

Web-Browsing and Interaction

// WebAutomationScript DSL for Web Scraping and Interaction

// Define a web scraping rule
rule ArticleScraping {
  target_url: "https://example.com/articles"
  element: "div.article"
  fields {
    title: "h2.title"
    author: "span.author"
    content: "div.content"
  }
}

// Scrape articles using the defined rule
scrape(ArticleScraping)

// Perform web interaction to click a button
click_button {
  selector: "button.submit"
}

// Fill a form field with input
fill_form {
  selector: "input#name"
  value: "John Smith"
}

// Submit the form
submit_form

// Generate a summary of the web page content
summary {
  length: 3
}

// Define an action for saving data
action SaveData {
  execute: {
    // Define the behavior for saving data
    // ...
  }
}

// Execute the save data action
execute_action(SaveData)

Online Marketing

// MarketingAutomationScript DSL for Online Marketing

// Define a marketing campaign
campaign SummerSale {
  start_date: "2023-06-01"
  end_date: "2023-08-31"
  target_audience: "segment1"
}

// Define a target audience
audience segment1 {
  demographics {
    age: [25, 40]
    location: "United States"
    interests: ["outdoor activities", "travel"]
  }
}

// Define an advertising channel
channel FacebookAds {
  budget: 1000
  target_audience: "segment1"
}

// Run the Summer Sale campaign on Facebook Ads
run_campaign(SummerSale, FacebookAds)

// Define an email template
template PromotionEmail {
  subject: "Special Promotion for You"
  body: "Dear {name},\n\nWe are excited to offer you a special promotion on our products. Don't miss out on this opportunity. Visit our website and use the code {promo_code} to avail yourself of the discount.\n\nBest regards,\nThe Marketing Team"
}

// Send a promotional email to the target audience
send_email(PromotionEmail, "segment1", {
  promo_code: "SUMMER20"
})

// Define a social media post
post SocialMediaPost {
  platform: "Twitter"
  content: "Exciting news! Our Summer Sale is now live. Visit our website and enjoy up to 50% off on selected items. #SummerSale #Discounts"
}

// Publish the social media post
publish(SocialMediaPost)

Slide decks

// SlideDeckScript DSL for Slide Decks

// Define a slide deck
slide_deck MyPresentation {
  title: "My Presentation"
  author: "John Doe"
  date: "2023-05-17"
}

// Define slide layouts
layout TitleSlide {
  template: "title_slide_template"
}

layout ContentSlide {
  template: "content_slide_template"
}

layout ImageSlide {
  template: "image_slide_template"
}

// Define slides
slide Title {
  layout: TitleSlide
  content: {
    title: "Welcome to My Presentation"
    subtitle: "Introduction to SlideDeckScript"
  }
}

slide Agenda {
  layout: ContentSlide
  content: {
    title: "Agenda"
    bullets: [
      "Overview",
      "Features",
      "Examples",
      "Conclusion"
    ]
  }
}

slide Feature1 {
  layout: ContentSlide
  content: {
    title: "Feature 1"
    text: "SlideDeckScript provides an easy way to define slide layouts and content."
  }
}

slide Example1 {
  layout: ImageSlide
  content: {
    image: "example1.png"
    caption: "Example Slide 1"
  }
}

// Add slides to the slide deck
add_slide(MyPresentation, Title)
add_slide(MyPresentation, Agenda)
add_slide(MyPresentation, Feature1)
add_slide(MyPresentation, Example1)

// Generate the slide deck
generate_slide_deck(MyPresentation)