A C# console app that uses HtmlAgilityPack and XPath to scrape job listings, extracts details (title, company, location, description), summarizes them via GPT, and even generates personalized cover letters.

dengaertig/SmartApply

JobScraper-CoverLetterGen

A C# console utility that scrapes job postings from StepStone, extracts key metadata (company, location, post date), and uses OpenAI’s GPT API to draft customized cover letters.

🚀 Features

  • Web scraping with HtmlAgilityPack
  • HTML parsing to extract company name, location, posting date, and job title
  • OpenAI GPT integration to generate personalized cover-letter drafts
  • Configurable via environment variables for API keys and target URLs
  • Extensible: can be adapted to other job boards or extended with e-mail notifications

🛠️ Technologies & Topics

  • Language: C#
  • Libraries: HtmlAgilityPack, RestSharp (or HttpClient), Newtonsoft.Json
  • APIs: OpenAI GPT
  • GitHub Topics:
    csharp, html-agility-pack, web-scraping, openai, job-scraper

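🔧 Prerequisites

  • .NET SDK (provides the dotnet CLI used in the steps below)
  • An OpenAI API key
  • Docker (optional, only needed for the container workflow)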

⚙️ Installation & Setup

  1. Clone the repo

    git clone https://github.com/your-username/JobScraper-CoverLetterGen.git
    cd JobScraper-CoverLetterGen
  2. Configure your environment
    Create a .env file in the project root:

    OPENAI_API_KEY=sk-YOUR_OPENAI_KEY
    STEPSTONE_USERNAME=your.email@example.com
    STEPSTONE_PASSWORD=YourPassword123
    JOB_LISTING_URL=https://www.stepstone.de/jobs/werkstudent-informatik/in-berlin
  3. Restore & build

    dotnet restore
    dotnet build --configuration Release
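.NET does not read a .env file on its own, so the app needs some way to pick up the values from step 2. The block below is a minimal sketch of one way to do that, not the project's actual loader: the `EnvConfig` class and its method names are hypothetical, and the rule that real environment variables take precedence over .env entries is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: reads KEY=VALUE pairs as used in the .env example.
public static class EnvConfig
{
    // Parses KEY=VALUE lines; blank lines and lines starting with '#' are skipped.
    public static Dictionary<string, string> ParseDotEnv(IEnumerable<string> lines)
    {
        var values = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0 || line[0] == '#') continue;

            // Split on the first '=' only, so URLs with query strings stay intact.
            int eq = line.IndexOf('=');
            if (eq <= 0) continue; // no key before '=': ignore the line

            var key = line.Substring(0, eq).Trim();
            var value = line.Substring(eq + 1).Trim().Trim('"');
            values[key] = value;
        }
        return values;
    }

    // Assumption: a real environment variable wins over the .env entry.
    public static string Get(string key, IReadOnlyDictionary<string, string> dotEnv)
    {
        var fromEnvironment = Environment.GetEnvironmentVariable(key);
        if (!string.IsNullOrEmpty(fromEnvironment)) return fromEnvironment;
        if (dotEnv.TryGetValue(key, out var fromFile) && fromFile.Length > 0) return fromFile;
        throw new InvalidOperationException($"Missing required setting '{key}'.");
    }
}
```

A caller could pass `File.ReadLines(".env")` to `ParseDotEnv` when the file exists and then request `OPENAI_API_KEY` and `JOB_LISTING_URL` through `Get`, which fails early with a clear message if either one is missing.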

▶️ Usage

Run the scraper and cover-letter generator in one command:

dotnet run --project src/JobScraper-CoverLetterGen

By default it will:

  1. Fetch the target StepStone page
  2. Parse the first 5 job postings
  3. Print each company, location, post date, and a GPT-generated cover letter draft to the console
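The draft printed in step 3 comes from a call to OpenAI's API. As a rough sketch of what such a call can look like, the block below builds a request for the Chat Completions endpoint and reads the first reply from the response. The `OpenAiChat` class is hypothetical, the model name is left to the caller because the README does not state one, and System.Text.Json is used here instead of Newtonsoft.Json only so the example needs no extra package.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical wrapper around OpenAI's Chat Completions HTTP endpoint.
public static class OpenAiChat
{
    public const string Endpoint = "https://api.openai.com/v1/chat/completions";

    // Builds the POST request: bearer-token auth plus a single user message.
    public static HttpRequestMessage BuildRequest(string apiKey, string model, string prompt)
    {
        var payload = new
        {
            model,
            messages = new[] { new { role = "user", content = prompt } }
        };
        var request = new HttpRequestMessage(HttpMethod.Post, Endpoint)
        {
            Content = new StringContent(
                JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json")
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        return request;
    }

    // Reads choices[0].message.content from the response body.
    public static string ExtractReply(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        return doc.RootElement
                  .GetProperty("choices")[0]
                  .GetProperty("message")
                  .GetProperty("content")
                  .GetString() ?? string.Empty;
    }

    // Sends the request and returns the generated text; throws on non-2xx responses.
    public static async Task<string> CompleteAsync(
        HttpClient http, string apiKey, string model, string prompt)
    {
        using var request = BuildRequest(apiKey, model, prompt);
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ExtractReply(await response.Content.ReadAsStringAsync());
    }
}
```

Keeping request construction and response parsing separate from the network call makes both halves easy to check without an API key. With one request per posting, it is also worth reusing a single `HttpClient` instance across calls.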

Command-Line Options

  --url <URL>             Override the default job-listing URL
  --max-jobs <number>     Limit to N postings (default: 5)
  --output <path>         Save results (JSON) to file

Example:

dotnet run --project src/JobScraper-CoverLetterGen -- --url https://www.stepstone.de/jobs/frontend-developer --max-jobs 3
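For illustration, the three options listed above could be parsed with a small hand-rolled loop like the one below. This is a sketch under assumptions, not the project's actual parser: `CliOptions` and `CliParser` are hypothetical names, and rejecting unknown options or non-positive `--max-jobs` values is a design choice the README does not specify.

```csharp
#nullable enable
using System;

// Hypothetical container for the documented options; MaxJobs defaults to 5.
public sealed record CliOptions(string? Url, int MaxJobs, string? OutputPath);

public static class CliParser
{
    public static CliOptions Parse(string[] args)
    {
        string? url = null;
        int maxJobs = 5;          // documented default
        string? output = null;

        for (int i = 0; i < args.Length; i++)
        {
            switch (args[i])
            {
                case "--url":
                    url = NextValue(args, ref i);
                    break;
                case "--max-jobs":
                    var raw = NextValue(args, ref i);
                    if (!int.TryParse(raw, out maxJobs) || maxJobs < 1)
                        throw new ArgumentException(
                            $"--max-jobs expects a positive integer, got '{raw}'.");
                    break;
                case "--output":
                    output = NextValue(args, ref i);
                    break;
                default:
                    throw new ArgumentException($"Unknown option '{args[i]}'.");
            }
        }
        return new CliOptions(url, maxJobs, output);
    }

    // Consumes the token after the current option and advances the index.
    private static string NextValue(string[] args, ref int i)
    {
        if (i + 1 >= args.Length)
            throw new ArgumentException($"Option '{args[i]}' requires a value.");
        return args[++i];
    }
}
```

A `Url` of `null` would signal that the caller should fall back to `JOB_LISTING_URL` from the environment.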

📦 Docker

  1. Build
    docker build -t jobscraper .
  2. Run
    docker run --rm \
      -e OPENAI_API_KEY="$OPENAI_API_KEY" \
      -e JOB_LISTING_URL="https://www.stepstone.de/jobs/..." \
      jobscraper

📝 How It Works

  1. HTML fetch via HttpClient
  2. Parsing with HtmlAgilityPack:
    • Select each <div data-genesis-element="BASE"> representing a job card
    • Extract company name, location, and “online date” text
  3. Prompt construction:
    “Write me a concise cover letter for a Werkstudent Informatik position at {Company} in {Location}, posted {Date}…”  
    
  4. API call to OpenAI GPT for each job
  5. Output to console or JSON file
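Steps 2 and 3 can be sketched as follows, using the HtmlAgilityPack NuGet package. Only the job-card selector comes from the description above. The inner XPath expressions for company, location, and date are placeholders that would have to be matched to StepStone's current markup, and `JobPosting` and `JobCardParser` are hypothetical names. The quoted prompt is elided in this README, so the sketch stops where the quoted text stops.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public sealed record JobPosting(string Company, string Location, string PostedDate);

public static class JobCardParser
{
    // Selector taken from the README: one <div> per job card.
    private const string CardXPath = "//div[@data-genesis-element='BASE']";

    // Assumed selectors, relative to a card; adjust to the real page markup.
    private const string CompanyXPath = ".//*[@data-at='job-item-company-name']";
    private const string LocationXPath = ".//*[@data-at='job-item-location']";
    private const string DateXPath = ".//time";

    public static List<JobPosting> Parse(string html, int maxJobs)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var jobs = new List<JobPosting>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var cards = doc.DocumentNode.SelectNodes(CardXPath);
        if (cards == null) return jobs;

        foreach (var card in cards)
        {
            if (jobs.Count >= maxJobs) break;
            jobs.Add(new JobPosting(
                Text(card, CompanyXPath),
                Text(card, LocationXPath),
                Text(card, DateXPath)));
        }
        return jobs;
    }

    // Returns the decoded, trimmed text of the first match, or "" if absent.
    private static string Text(HtmlNode card, string xpath)
    {
        var node = card.SelectSingleNode(xpath);
        return node == null ? string.Empty : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    // Fills the prompt template from step 3 with the extracted fields.
    public static string BuildPrompt(JobPosting job) =>
        $"Write me a concise cover letter for a Werkstudent Informatik position " +
        $"at {job.Company} in {job.Location}, posted {job.PostedDate}.";
}
```

Returning an empty string for a missing field keeps one malformed card from aborting the whole run, though the resulting prompt will then have a gap. A stricter variant could skip such cards instead.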

🤝 Contributing

  1. Fork the repo
  2. Create your feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -m 'Add some feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request

🛡️ License

This project is licensed under the MIT License. See LICENSE for details.


⚠️ Disclaimer

This project is intended for educational and personal use only.

This scraper is not affiliated with, endorsed by, or in any way connected to StepStone GmbH or its affiliates.

It uses publicly available web pages and does not bypass any authentication or access controls.

You are responsible for complying with the Terms of Service of StepStone, as well as any applicable laws and regulations.

The author assumes no liability for any misuse of this code, including but not limited to:

  • Accessing data you are not authorized to view
  • Overloading StepStone’s servers
  • Violating any terms of service or legal restrictions

Use this project at your own risk. All warranties and liabilities are disclaimed to the fullest extent permitted by law.

Happy scraping & good luck with your next application!
