Blog

Thoughts, stories, and updates.

Rapid Spring Boot: Notes on JWT Authentication and JPA Data Persistence

Abstract: This post shares developer notes on rapidly implementing JWT authentication and data persistence in Spring Boot 3. It covers experiences with Spring Security, the efficiency of JPA annotations for entity management, creating stateless sessions with JWTs, and using BCrypt for password hashing. The author contrasts these modern Spring Boot...

Cracking AWS Network Issues: EC2 Docker to RDS Postgres Connectivity

Abstract: This post provides a step-by-step guide for diagnosing and resolving network connectivity issues between a Dockerized Spring Boot application running on AWS EC2 and an RDS Postgres database within the same VPC. It details the use of tools like nslookup, nc, and psql for troubleshooting, and explains how to...

AI for Family Stories: Transcribing, Mapping, and Visualizing Our Histories

Abstract: This post explores brainstorming ideas for an AI-powered system to transcribe, understand, and visualize family stories shared during gatherings. Key concepts include narrator detection to distinguish between speakers, mapping complex family relationships described in the narratives, identifying primary sources for stories, and constructing an interactive family history knowledge graph....

Building a Smarter AI Agent with Haystack 1.x: My Journey with RAG and Custom Pipelines

Abstract: This post details the author’s experience implementing Retrieval Augmented Generation (RAG) and AI Agents using Haystack 1.x. It covers the setup of conversational memory and the creation of custom data pipelines that use a FAISS DocumentStore for local knowledge and integrate Google Search for external information. Key challenges discussed include the...

Exploring Haystack: Building Advanced NLP Applications with LLMs and Vector Search

Abstract: This post shares practical tips and experiences from working with the Haystack LLM framework for building advanced NLP applications. It covers navigating version differences (1.x vs. 2.x beta), the advantages of developing with a forked repository for deeper understanding and contributions, managing Python dependencies using pyproject.toml, and best practices...

AI Model Formats Explained: Demystifying Llama.cpp, GGUF, GGML, and Transformers

Abstract: This post provides a breakdown of a helpful Reddit discussion concerning various AI model formats (GGUF, GGML, safetensors) and associated tools like Llama.cpp and Hugging Face Transformers. The author adds personal notes on GGML’s role in model speed versus quantization and summarizes key takeaways for understanding the complex landscape...

The Challenge of Finding Uncensored AI Models

Abstract: This post explores the challenges of finding truly uncensored AI models, noting the influence of OpenAI’s data and built-in “guardrail” limitations. It discusses methods for fine-tuning base models to remove such refusals, referencing Eric Hartford’s approach, and shares the author’s experience with some specific models, including one found to...

Speed Up GPT-J-6B: From Minutes to Seconds with GGML Conversion

Abstract: This post serves as a guide to dramatically improve the inference speed of the GPT-J-6B language model on a local machine. It details the author’s experience converting a Hugging Face float16 model to the GGML format, which reduced response times from minutes to under 20 seconds. The process covers...