•Engineered 5 internal AI-powered systems (LLM-based recruitment tools, model inferencing services, speech-to-speech pipelines) to automate workflows, saving 90+ manual hours per week.
•Led development of an open-source AI coding assistance tool, enabling data sovereignty and generating $50K+ in client revenue through enterprise deployments.
•Implemented Retrieval-Augmented Generation (RAG) pipelines using embeddings and vector-based retrieval, improving response relevance by 40% and achieving near-zero hallucination on knowledge-base queries.
•Designed an open-source Inferencing as a Service (IaaS) architecture for private AI workloads, optimizing model serving and cutting token costs by 70%.
•Built scripts for LLM and TTS concurrency testing, synthetic data generation, and performance benchmarking to evaluate throughput, latency, time-to-first-token, and maximum concurrent users.