Dhruv Toshniwal

Search-Engine

April 2024

Project Overview

In my latest project, I built a search engine that combines web scraping with AI-driven analysis, inspired by Perplexity AI. The focus is on efficient data processing and intelligent web analysis.

Key Technical Highlights

Streamlining Data Processing

  • Refined text processing techniques
  • Improved algorithm speed
  • Enhanced text chunking capabilities
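
As a rough illustration of the chunking step, here is a minimal sketch that assumes word-based chunks with a fixed overlap between neighbours. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are hypothetical and are not taken from the repository; the actual implementation may split on sentences or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks that share `overlap` words.

    Illustrative only: the sizes are assumptions, not the project's settings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text,
        # so no trailing chunk is made up purely of overlap.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of some repeated text.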

Innovative Database Approach

  • Implemented an in-code vector database
  • Deliberately avoided external databases to reduce latency
  • Prioritized direct and fast real-time processing
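
To make the idea concrete, here is a minimal sketch of what an in-code vector store could look like, assuming "in-code" means a structure held in process memory and searched by cosine similarity. The class `InMemoryVectorStore` and its methods are hypothetical, and the embeddings are assumed to come from whatever model the project uses; none of this is taken from the repository.

```python
import math


class InMemoryVectorStore:
    """Toy vector store: keeps normalized vectors in a plain Python list."""

    def __init__(self):
        self._items = []  # list of (text, unit-length vector)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("zero vector cannot be normalized")
        return [x / norm for x in vector]

    def add(self, text, vector):
        # Normalizing at insert time means search only needs a dot product.
        self._items.append((text, self._normalize(vector)))

    def search(self, query_vector, k=3):
        """Return up to k (score, text) pairs, highest cosine similarity first."""
        q = self._normalize(query_vector)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text)
            for text, vec in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A brute-force scan like this avoids a network round trip to an external database, but its cost grows linearly with the number of stored chunks.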

Performance Optimization

  • Utilized parallel processing
  • Designed for consistent speed across varying data loads
  • Aimed to maintain system responsiveness
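
One way the parallel step could be arranged is sketched below, assuming each scraped page can be processed independently. The helpers `clean_page` and `process_pages` are hypothetical stand-ins for the project's real per-page work, and the thread pool is an assumption about how the parallelism is done.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_page(raw: str) -> str:
    """Hypothetical per-page step: collapse runs of whitespace."""
    return " ".join(raw.split())


def process_pages(pages: list[str], max_workers: int = 8) -> list[str]:
    """Process pages concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of the input iterable.
        return list(pool.map(clean_page, pages))
```

Threads suit I/O-bound work such as fetching pages or calling an API. For CPU-heavy processing in Python, `ProcessPoolExecutor` from the same module is usually the better fit.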

Future Development Goals

  • Reduce API token usage
  • Explore data compression techniques
  • Evaluate distributed computing strategies

Technical Challenges

  • Managing massive web datasets
  • Maintaining data quality during compression
  • Balancing processing speed with comprehensive analysis

Project Resources

GitHub Repository: https://github.com/DhruvAjayToshniwal/Search-Engine

Overall, the project aims for a more efficient AI-powered search solution, with speed and intelligent data handling as its priorities.