My AI Sidekick

A personal assistant that blends LLMs, hierarchical memory, and voice interfaces to anticipate needs while keeping data private.

March 30, 2025 · 10 min read

Tags: AI Assistant, Voice Interface, LLM, Memory Systems, Personal Project

Orchestrating my digital life with a memory-aware voice assistant.

Why I Built It

Most assistants feel transactional: they forget context, they live in someone else's cloud, and they never feel mine. This project explores what happens when you design an assistant around memory, privacy, and voice-first UX.

“The vision: an ambient AI sidekick that remembers what matters, anticipates routines, and respects personal data boundaries.”

What the Assistant Does Today

Voice-first orchestration: talk to the assistant and it executes across Spotify, Notion, Gmail, Calendar, Twitter, and custom tools.
Contextual reminders: memory-driven nudges that reference specific tasks and past conversations.
Adaptive media: changing music or routines based on the time of day, current tasks, and historical preference.
Privacy controls: all memory storage is local-first with granular deletion and audit trails.

Instead of scripting behaviors, the assistant learns patterns—like preparing meeting briefs ahead of time or switching to focus playlists during coding blocks.

Technical Architecture

High-level flow from speech input to tool execution.

Language core: Gemini 2.0 Flash (primary) with Llama fallback.
Orchestration: LangGraph agents drive tool selection, error recovery, and conversational state.
Voice I/O: Whisper for transcription; Kokoro TTS for natural responses.
Integrations: Spotify, Notion, Gmail, Google Calendar, Twitter, web search.
Memory substrate: Custom hierarchical store across episodic, semantic, and procedural layers.
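
To make the primary/fallback arrangement in the language core concrete, here is a minimal sketch of one way it could work. It is an illustration under assumptions, not the project's actual code: `generate_with_fallback`, `flaky_primary`, and `local_fallback` are hypothetical names, and the two stand-in functions take the place of real Gemini and Llama client calls so the example runs offline.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: any function that takes a prompt and returns reply text.
ModelFn = Callable[[str], str]


def generate_with_fallback(
    prompt: str, models: Sequence[Tuple[str, ModelFn]]
) -> Tuple[str, str]:
    """Try each model in order; return (model_name, reply) from the first success."""
    errors = []
    for name, call in models:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stand-in functions (assumptions) so the sketch runs without network access.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary model unavailable")


def local_fallback(prompt: str) -> str:
    return f"(fallback) {prompt}"


used, reply = generate_with_fallback(
    "play some jazz",
    [("primary", flaky_primary), ("fallback", local_fallback)],
)
print(used, reply)
```

Keeping the model list ordered means the same wrapper works whether the fallback is a second cloud model or a local one.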

Why memory matters

Unlike assistants that either forget everything or log conversations to opaque clouds, this system keeps a local-first memory store organized into three layers:

Episodic: time-stamped snapshots of key interactions (“Asked for jazz playlists before Tuesday advisory meeting”).
Semantic: distilled preferences (“Prefers lo-fi during coding sprints”).
Procedural: sequences and routines (“Checks inbox after updating Notion boards”).
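
The three layers above could be represented with a small in-memory structure like the one below. This is a sketch under assumptions rather than the real storage layer: the class names (`Layer`, `Memory`, `MemoryStore`), the `forget` method, and the plain-string audit log are all hypothetical, chosen only to illustrate layered storage with granular deletion and an audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Callable, List


class Layer(Enum):
    EPISODIC = "episodic"      # time-stamped snapshots of interactions
    SEMANTIC = "semantic"      # distilled preferences
    PROCEDURAL = "procedural"  # sequences and routines


@dataclass
class Memory:
    layer: Layer
    text: str
    created_at: datetime


@dataclass
class MemoryStore:
    items: List[Memory] = field(default_factory=list)
    audit_log: List[str] = field(default_factory=list)  # assumed format: plain strings

    def add(self, layer: Layer, text: str, created_at: datetime) -> Memory:
        memory = Memory(layer, text, created_at)
        self.items.append(memory)
        self.audit_log.append(f"add {layer.value}: {text}")
        return memory

    def by_layer(self, layer: Layer) -> List[Memory]:
        return [m for m in self.items if m.layer is layer]

    def forget(self, should_delete: Callable[[Memory], bool]) -> int:
        """Delete matching memories, log each deletion, and return the count."""
        doomed = [m for m in self.items if should_delete(m)]
        for m in doomed:
            self.audit_log.append(f"delete {m.layer.value}: {m.text}")
        self.items = [m for m in self.items if not should_delete(m)]
        return len(doomed)


store = MemoryStore()
now = datetime(2025, 3, 25, 9, 0)
store.add(Layer.EPISODIC, "Asked for jazz playlists before Tuesday advisory meeting", now)
store.add(Layer.SEMANTIC, "Prefers lo-fi during coding sprints", now)
store.add(Layer.PROCEDURAL, "Checks inbox after updating Notion boards", now)
deleted = store.forget(lambda m: "jazz" in m.text)
```

A persistent version would write to a local database instead of Python lists, but the interface (add, query by layer, delete with a logged record) could stay the same.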

Procrastination guardian

A background worker polls Notion every five minutes, compares tasks with current focus, and delivers voice prompts only when there’s a meaningful gap. Reminders acknowledge the context (“You planned to prep the advisory deck—want to start with the outline together?”).
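
A rough sketch of the guardian's decision step follows. The definition of a "meaningful gap" here is an assumption made for illustration: the most urgent unfinished task is due within a few hours and is not what the user is currently focused on. `Task`, `find_gap`, `poll_once`, and the `fetch_tasks` and `speak` callables are hypothetical; a real worker would fetch tasks from Notion and hand the prompt to the TTS engine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, Optional

POLL_INTERVAL = timedelta(minutes=5)  # from the text: poll every five minutes


@dataclass
class Task:
    title: str
    due: datetime
    done: bool = False


def find_gap(
    tasks: Iterable[Task],
    current_focus: str,
    now: datetime,
    horizon: timedelta = timedelta(hours=4),  # assumed "due soon" window
) -> Optional[Task]:
    """Return the most urgent due-soon task unless it is already the current focus."""
    due_soon = [t for t in tasks if not t.done and t.due - now <= horizon]
    if not due_soon:
        return None
    most_urgent = min(due_soon, key=lambda t: t.due)
    if most_urgent.title.casefold() == current_focus.casefold():
        return None  # already working on it, so stay quiet
    return most_urgent


def poll_once(
    fetch_tasks: Callable[[], Iterable[Task]],
    current_focus: str,
    now: datetime,
    speak: Callable[[str], None],
) -> bool:
    """Run one polling cycle; speak a nudge only when a gap is found."""
    task = find_gap(fetch_tasks(), current_focus, now)
    if task is None:
        return False
    speak(f"You planned to work on '{task.title}'. Want to start now?")
    return True


now = datetime(2025, 3, 25, 9, 0)
tasks = [
    Task("Prep advisory deck", now + timedelta(hours=2)),
    Task("Reply to newsletter", now + timedelta(days=3)),
]
spoken = []
nudged = poll_once(lambda: tasks, "Refactor parser", now, spoken.append)
```

A scheduler would call `poll_once` every `POLL_INTERVAL`; keeping the gap check as a pure function makes it easy to test without touching Notion or audio.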

The Vision: Where This Is Going

The current implementation works well for daily use, but there are several areas I plan to enhance:

Planned Memory Architecture Enhancements

The most significant planned improvements focus on the memory system:

Memory Links: Creating associative connections between related memories, allowing the system to traverse a knowledge graph about user preferences
Memory Decay: Implementing importance-weighted forgetting algorithms that mimic human memory patterns
Hierarchical Compression: Using autoencoders to compress routine memories into higher-level concepts
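
As a hedged illustration of what importance-weighted forgetting could look like, the sketch below uses a simple exponential decay in which more important memories get a longer half-life. The formula, the constants (a 7-day base half-life, the factor of 4, the 0.1 threshold), and the names `retention` and `should_forget` are assumptions for illustration, not the planned algorithm.

```python
def retention(
    age_days: float, importance: float, base_half_life_days: float = 7.0
) -> float:
    """Return a 0..1 retention score; importance in [0, 1] stretches the half-life."""
    if age_days < 0 or not 0.0 <= importance <= 1.0:
        raise ValueError("age_days must be >= 0 and importance must be in [0, 1]")
    # Assumed scaling: importance 0 keeps the base half-life, importance 1 gives 5x.
    half_life = base_half_life_days * (1.0 + 4.0 * importance)
    return 0.5 ** (age_days / half_life)


def should_forget(age_days: float, importance: float, threshold: float = 0.1) -> bool:
    """Mark a memory for forgetting once its retention drops below the threshold."""
    return retention(age_days, importance) < threshold


for importance in (0.0, 0.5, 1.0):
    print(importance, round(retention(30, importance), 3), should_forget(30, importance))
```

With these assumed constants, a trivial memory fades within about a month while a highly important one is still retained, which is the qualitative behavior the Memory Decay item describes.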

Future AI Integration

Planned enhancements include:

Predictive Suggestions: Using transformer models to anticipate needs based on time, context, and past behavior
Habit Formation: Applying reinforcement learning to help build positive habits
Multimodal Understanding: Integrating vision models to understand the user's environment
Emotion Recognition: Training models on voice patterns to detect stress or fatigue

The architecture is intended to scale to multiple users, each with their own private memory system and preference models. This keeps assistance personalized while maintaining each user's privacy.

Lessons Learned

Implementation Challenges

Key technical challenges and solutions:

Voice Latency: Optimized the pipeline by running audio processing in separate threads and implementing efficient memory retrieval
Context Window Limitations: Implemented relevance-based filtering to surface only the most pertinent memories
Tool Integration Complexity: Developed robust error handling and authentication flows for third-party services
Memory Extraction: Implemented a classification system for semantic memory storage
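
The relevance-based filtering mentioned under Context Window Limitations can be sketched as follows. To keep the example dependency-free, it scores memories by word overlap (Jaccard similarity) with the query; that is a stand-in chosen for illustration, and a real system would more likely use embedding similarity. The function names, the word budget, and the score threshold are all assumptions.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def relevance(query: str, memory: str) -> float:
    """Jaccard overlap between the word sets of the query and the memory."""
    q, m = tokens(query), tokens(memory)
    if not q or not m:
        return 0.0
    return len(q & m) / len(q | m)


def select_memories(
    query: str, memories: List[str], max_words: int = 50, min_score: float = 0.1
) -> List[str]:
    """Pick the highest-scoring memories that fit within a word budget."""
    scored = [(relevance(query, m), m) for m in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    chosen, used = [], 0
    for score, memory in scored:
        if score < min_score:
            break  # everything after this is even less relevant
        cost = len(memory.split())
        if used + cost > max_words:
            continue  # too long for the remaining budget; try a shorter one
        chosen.append(memory)
        used += cost
    return chosen


memories = [
    "Prefers lo-fi during coding sprints",
    "Checks inbox after updating Notion boards",
    "Asked for jazz playlists before Tuesday advisory meeting",
]
picked = select_memories("play something for my coding sprints", memories)
print(picked)
```

Only the memories that clear the threshold and fit the budget are passed into the prompt, which keeps the context window small regardless of how large the store grows.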

Personal Growth

This project taught me more than just technical skills:

The Power of Context: I was stunned by how much more helpful an assistant becomes when it truly remembers past interactions. Memory isn't just a nice-to-have; it's the essence of intelligent assistance.
Privacy by Design: Building privacy-respecting technology isn't just ethically sound—it results in better user experiences. When the assistant isn't trying to extract data for other purposes, it can focus entirely on serving the user.
Integration > Isolation: The most valuable features emerged at the intersections of different tools—like using Notion task deadlines to inform procrastination detection.
Voice UX Complexity: Voice interfaces introduce unique challenges around feedback, confirmation, and error recovery. I developed a new appreciation for thoughtful voice interaction design.

On a personal level, this project shifted my perspective from "building cool tech" to "solving human problems." That transformation changed how I view my career path—I'm now focused on creating technology that genuinely improves lives rather than just showcasing technical prowess.

Results & Impact

The impact on my daily productivity has been substantial:

Time Efficiency: Approximately 45 minutes daily saved through streamlined tool interactions
Task Completion: 15% improvement in on-time task completion
Focus Duration: 50% increase in average focus session length

What's Next

I'm continuing to enhance the system along several dimensions:

Deep Learning Integration: Implementing transformer models and other deep learning techniques for enhanced personalization and prediction
Reinforcement Learning: Adding RL capabilities for adaptive task optimization and habit formation
Expanded Tool Integration: Adding more services and deeper integration capabilities
Offline Capabilities: Reducing dependency on cloud services for better privacy and reliability
Multi-User Support: Adapting the architecture to support multiple users while maintaining personal privacy

Join the Conversation

What would you want in a truly personal AI assistant? What features would make your digital life better? I'd love to hear your thoughts!