My AI Sidekick
A voice-based personal assistant that combines an LLM with a custom memory system to anticipate user needs while keeping personal data private

My AI Sidekick in action: orchestrating my digital life
Introduction: The Why
In today's digital landscape, personal assistants have become essential tools for managing our daily lives. However, many existing solutions face challenges in maintaining context, personalization, and privacy. This project aims to address these challenges by creating a voice-based personal assistant that truly understands and adapts to individual needs.
The goal is to build a system that learns user preferences, habits, and needs through natural interaction, creating a more intuitive and helpful experience.
This project combines technical innovation with practical utility, creating a system that not only demonstrates advanced AI capabilities but also serves as a valuable productivity tool.
Here's how I built a voice-based personal assistant that understands user needs while maintaining privacy and control over personal data.
The Assistant That Truly Gets Me
At its core, my assistant is a voice-powered sidekick that orchestrates my digital life. It controls Spotify, manages my Notion tasks, checks Gmail, handles my Calendar, and even gives me a gentle nudge when I'm procrastinating.
The key differentiator is its ability to build genuine context about user preferences over time:
- Voice interaction: Natural conversation with a system that remembers past interactions
- Multi-service integration: A single interface for productivity tools
- Personalized memory: Builds understanding of user preferences and habits over time
- Task management: Provides timely, contextual reminders for pending tasks
The system demonstrates its capabilities through practical applications. For example, when scheduling meetings, it not only sets up the appointment but also suggests preparing relevant materials based on past patterns. This isn't following a script—it's learning from user behavior.
When playing music, it adapts to user preferences during different activities, such as focus music during work sessions or more energetic tracks during coding sprints. These aren't hard-coded rules; they emerge from the system learning user habits.
The Technical Implementation
Core Components
The system is built using a stack that balances power with practicality:
- Brain: Google's Gemini 2.0 Flash handles the natural language understanding, with Llama as an alternative backend
- Orchestration: LangGraph manages conversation flows and tool execution
- Voice: Whisper for speech-to-text and Kokoro TTS for natural-sounding responses
- Integrations: API connections to Spotify, Notion, Gmail, Google Calendar, Twitter, and web search capabilities
- Memory: Custom-built hierarchical memory system for context persistence

High-level architecture showing the flow from speech input to tool execution
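To make the flow concrete, here is a minimal sketch of one conversational turn, from speech input to tool execution. It is not the project's actual code: `transcribe`, `plan`, `TOOLS`, and `handle_utterance` are hypothetical names, the speech-to-text step is faked by decoding bytes, and a keyword router stands in for the LLM and the LangGraph orchestration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    name: str      # which integration to invoke
    argument: str  # free-text argument passed to the tool


def transcribe(audio: bytes) -> str:
    """Stand-in for speech-to-text: the 'audio' here is already UTF-8 text."""
    return audio.decode("utf-8")


def plan(text: str) -> ToolCall:
    """Stand-in for the LLM: a keyword router instead of a real model call."""
    lowered = text.lower()
    if lowered.startswith("play "):
        return ToolCall("music", text[5:])
    if "task" in lowered:
        return ToolCall("tasks", text)
    return ToolCall("chat", text)


# Hypothetical tool registry; real tools would call Spotify, Notion, and so on.
TOOLS: Dict[str, Callable[[str], str]] = {
    "music": lambda arg: f"Playing {arg}",
    "tasks": lambda arg: "Listing your open tasks",
    "chat": lambda arg: f"You said: {arg}",
}


def handle_utterance(audio: bytes) -> str:
    """Run one turn: speech -> text -> tool choice -> tool result to be spoken."""
    text = transcribe(audio)
    call = plan(text)
    return TOOLS[call.name](call.argument)


print(handle_utterance(b"play lo-fi beats"))  # prints "Playing lo-fi beats"
```

Keeping the tools behind a single registry is what lets one voice interface front many services: adding an integration means adding one entry rather than touching the turn loop.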
The Memory System: My Secret Sauce
The most powerful component, and what separates my assistant from big tech alternatives, is the memory system. While commercial assistants either forget our conversations immediately or store them in the cloud for "training" (and ad targeting), my system builds a multifaceted understanding of the user. It doesn't just passively record; it actively extracts insights and provides them as context for future interactions, organized into three memory types:
- Episodic Memory: Stores specific interactions with timestamps, like "You asked about jazz music last Tuesday when you were working on the project report"
- Semantic Memory: Extracts general facts and preferences over time, like "You prefer lo-fi music while coding" or "You usually forget to prepare for advisory meetings"
- Procedural Memory: Learns patterns about tool usage and workflows, like "You typically check email after finishing Notion tasks"
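One way to picture the three memory types is as tagged records in a single store. The sketch below is an illustration under assumptions, not the real schema: `MemoryKind`, `Memory`, and `MemoryStore` are hypothetical names, and retrieval is a plain keyword match rather than whatever the actual system uses.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class MemoryKind(Enum):
    EPISODIC = "episodic"      # specific interactions with timestamps
    SEMANTIC = "semantic"      # general facts and preferences
    PROCEDURAL = "procedural"  # tool-usage and workflow patterns


@dataclass
class Memory:
    kind: MemoryKind
    text: str
    created_at: datetime


@dataclass
class MemoryStore:
    memories: List[Memory] = field(default_factory=list)

    def add(self, kind: MemoryKind, text: str,
            created_at: Optional[datetime] = None) -> Memory:
        """Store a memory, stamping it with the current time by default."""
        memory = Memory(kind, text, created_at or datetime.now())
        self.memories.append(memory)
        return memory

    def recall(self, keyword: str) -> List[Memory]:
        """Return memories whose text contains the keyword (case-insensitive)."""
        needle = keyword.lower()
        return [m for m in self.memories if needle in m.text.lower()]

    def as_context(self, keyword: str) -> str:
        """Format matching memories as lines that could be added to a prompt."""
        return "\n".join(f"[{m.kind.value}] {m.text}" for m in self.recall(keyword))


store = MemoryStore()
store.add(MemoryKind.EPISODIC, "Asked about jazz music while writing the project report")
store.add(MemoryKind.SEMANTIC, "Prefers lo-fi music while coding")
store.add(MemoryKind.PROCEDURAL, "Typically checks email after finishing Notion tasks")
print(store.as_context("music"))
```

Tagging each record with its kind keeps everything in one place while still letting the prompt builder treat a one-off event differently from a stable preference.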
The Procrastination Checker: My Digital Conscience
Another unique feature is the procrastination detection system. Every 5 minutes, a background thread:
- Retrieves my current priority tasks from Notion
- Compares them against what I'm currently focusing on
- If there's a significant mismatch, generates a contextual reminder
- Delivers it via voice, referencing specific tasks that need attention
The key insight was making these reminders smart—not just generic "get back to work" messages. When it detects procrastination, it evaluates both the importance and urgency of pending tasks, then frames the reminder in a way that acknowledges the difficulty of the task while providing a small, actionable next step.
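The check described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Task` fields, the `priority` formula (importance multiplied by an urgency term), the substring test used as the "mismatch" check, and the callback names `get_tasks`, `get_focus`, and `speak`. The real system pulls tasks from Notion and speaks through the TTS pipeline.

```python
import threading
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    title: str
    importance: int         # 1 (low) to 5 (high), an assumed scale
    hours_until_due: float
    next_step: str          # a small, actionable first step


def priority(task: Task) -> float:
    """Combine importance and urgency; urgency rises as the deadline nears."""
    urgency = 1.0 / (1.0 + max(task.hours_until_due, 0.0))
    return task.importance * urgency


def build_reminder(tasks: List[Task], current_focus: str) -> Optional[str]:
    """Return a reminder if the top-priority task is not the current focus."""
    if not tasks:
        return None
    top = max(tasks, key=priority)
    if top.title.lower() in current_focus.lower():
        return None  # already working on the most pressing task
    return (f"I know '{top.title}' is a tough one. "
            f"How about starting small: {top.next_step}?")


def start_checker(get_tasks: Callable[[], List[Task]],
                  get_focus: Callable[[], str],
                  speak: Callable[[str], None],
                  interval_seconds: float = 300.0) -> threading.Event:
    """Run the check every interval in a daemon thread; set the event to stop."""
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout, so the body runs once per interval.
        while not stop.wait(interval_seconds):
            reminder = build_reminder(get_tasks(), get_focus())
            if reminder:
                speak(reminder)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Waiting on an `Event` instead of calling `time.sleep` means the checker can be shut down immediately rather than after the remainder of a five-minute sleep.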
The Vision: Where This Is Going
While the current implementation is robust, there are several areas for future enhancement:
Planned Memory Architecture Enhancements
The most significant planned improvements focus on the memory system:
- Memory Links: Creating associative connections between related memories, allowing the system to traverse a knowledge graph about user preferences
- Memory Decay: Implementing importance-weighted forgetting algorithms that mimic human memory patterns
- Hierarchical Compression: Using autoencoders to compress routine memories into higher-level concepts
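Since memory decay is still on the roadmap, the following is only one possible shape for importance-weighted forgetting, not a committed design. It assumes exponential decay with a half-life that grows with importance; the function names, the 7-day base half-life, the `1 + 4 * importance` scaling, and the 0.1 pruning threshold are all illustrative choices.

```python
import math
from typing import List, Tuple


def retention(age_days: float, importance: float,
              base_half_life_days: float = 7.0) -> float:
    """Exponential decay whose half-life stretches with importance in [0, 1]."""
    half_life = base_half_life_days * (1.0 + 4.0 * importance)
    return math.exp(-math.log(2) * age_days / half_life)


def prune(memories: List[Tuple[str, float, float]],
          threshold: float = 0.1) -> List[str]:
    """Keep the text of (text, age_days, importance) entries above the threshold."""
    return [text for text, age, importance in memories
            if retention(age, importance) >= threshold]


memories = [
    ("Played a random playlist once", 60.0, 0.0),
    ("Advisory meetings need prepared slides", 60.0, 1.0),
]
print(prune(memories))  # only the high-importance memory survives
```

With these numbers an unimportant memory halves in strength every week, while a maximally important one takes five weeks to do the same, so routine events fade long before meaningful ones.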
Future AI Integration
Planned enhancements include:
- Predictive Suggestions: Using transformer models to anticipate needs based on time, context, and past behavior
- Habit Formation: Applying reinforcement learning to help build positive habits
- Multimodal Understanding: Integrating vision models to understand the user's environment
- Emotion Recognition: Training models on voice patterns to detect stress or fatigue
The system's architecture is designed to scale to multiple users, each with their own private memory system and preference models. This approach ensures personalized assistance while maintaining user privacy.
Lessons Learned
Implementation Challenges
Key technical challenges and solutions:
- Voice Latency: Optimized the pipeline by running audio processing in separate threads and implementing efficient memory retrieval
- Context Window Limitations: Implemented relevance-based filtering to surface only the most pertinent memories
- Tool Integration Complexity: Developed robust error handling and authentication flows for third-party services
- Memory Extraction: Implemented a classification system for semantic memory storage
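As an illustration of the relevance-based filtering mentioned for the context window, here is a minimal sketch that ranks memories against the current query and keeps only what fits a budget. It swaps in a simple technique, bag-of-words cosine similarity with a word-count budget, which is an assumption for the example; the real system may well use embeddings and token counts. The names `select_memories`, `max_words`, and `min_score` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> Counter:
    """Lowercase bag of words; hyphenated terms split into separate tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_memories(query: str, memories: List[str],
                    max_words: int = 40, min_score: float = 0.1) -> List[str]:
    """Pick the most relevant memories that fit within a word budget."""
    q = tokenize(query)
    scored = sorted(((cosine(q, tokenize(m)), m) for m in memories),
                    key=lambda pair: pair[0], reverse=True)
    chosen: List[str] = []
    used = 0
    for score, memory in scored:
        if score < min_score:
            break  # everything after this is even less relevant
        cost = len(memory.split())
        if used + cost > max_words:
            continue  # too long for the remaining budget; try shorter ones
        chosen.append(memory)
        used += cost
    return chosen


memories = [
    "You prefer lo-fi music while coding",
    "You typically check email after finishing Notion tasks",
    "You asked about jazz music last Tuesday",
]
print(select_memories("play some music for coding", memories))
```

Here the email habit is dropped because it shares no words with the query, which is the point: the prompt only carries memories that bear on the current request.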
Personal Growth
This project taught me more than just technical skills:
- The Power of Context: I was stunned by how much more helpful an assistant becomes when it truly remembers past interactions. Memory isn't just a nice-to-have; it's the essence of intelligent assistance.
- Privacy by Design: Building privacy-respecting technology isn't just ethically sound—it results in better user experiences. When the assistant isn't trying to extract data for other purposes, it can focus entirely on serving the user.
- Integration > Isolation: The most valuable features emerged at the intersections of different tools—like using Notion task deadlines to inform procrastination detection.
- Voice UX Complexity: Voice interfaces introduce unique challenges around feedback, confirmation, and error recovery. I developed a new appreciation for thoughtful voice interaction design.
On a personal level, this project shifted my perspective from "building cool tech" to "solving human problems." That transformation changed how I view my career path—I'm now focused on creating technology that genuinely improves lives rather than just showcasing technical prowess.
Results & Impact
The impact on my daily productivity has been substantial:
- Time Efficiency: Approximately 45 minutes daily saved through streamlined tool interactions
- Task Completion: 15% improvement in on-time task completion
- Focus Duration: 50% increase in average focus session length
What's Next
I'm continuing to enhance the system along several dimensions:
- Deep Learning Integration: Implementing transformer models and other deep learning techniques for enhanced personalization and prediction
- Reinforcement Learning: Adding RL capabilities for adaptive task optimization and habit formation
- Expanded Tool Integration: Adding more services and deeper integration capabilities
- Offline Capabilities: Reducing dependency on cloud services for better privacy and reliability
- Multi-User Support: Adapting the architecture to support multiple users while maintaining personal privacy
Join the Conversation
What would you want in a truly personal AI assistant? What features would make your digital life better? I'd love to hear your thoughts!