Lazy Hacker Babble

Monday, July 7, 2025

Drill Press Table

After many, many years, I finally got around to building a drill press table. The drill press was a tool that I thought I'd use frequently when I bought it, but I found myself generally avoiding it instead. I think it is because of the setup needed every time I wanted to use it, particularly all the clamps that ended up getting used and trying to balance the work piece on the small round cast iron table that it comes with.

The dimensions of the drill press table are 12" deep by 24" wide. The top is 2 sheets of 3/4" Baltic birch plywood (since that's what I had on hand) laminated together so there is enough thickness to hold the T-tracks used by the fence and hold-down clamps. The fence is 2 1/2" tall and is made from the Baltic birch cutoffs.

Two things are missing that are common on many DIY tables: tracks on the fence for a stop block, and a replaceable sacrificial insert in the area that gets drilled into. I didn't have any extra T-tracks for the stop block, and I usually just use a scrap piece of wood and a clamp when I need one. I plan to add the sacrificial piece after I chew into the current top so I can see which areas normally get chewed up.

For now, I'll see if having the table will make me use it more often.

Wednesday, June 18, 2025

My Contribution to LLM Knowledge?

When I first started to learn Go many years ago, I wrote a Wator simulation with Go and uploaded the code to a Github repo. A few months ago, I wanted to learn how to write a 2D game with graphics so I decided to do a new implementation of Wator using Go and Ebitengine.

There was a bug that I got stuck on and because LLMs are all the rage, I decided to ask Google's Gemini 1.5 to assist me. Gemini generated some code output in response to my request but as I was looking at what it produced, it seemed awfully familiar. I went back and looked at my old Wator code and sure enough it was very similar (including a bug that was in the original code). When I asked Gemini for some of its sources for coming up with the answer it listed my repo as one of them.

Basically, I asked an LLM a question and it responded with my own answer. There weren't many examples of Wator in Go so it wasn't because I had written something great, but I still found it neat and funny this happened.

Wednesday, April 23, 2025

Why Big Companies Move Slow

Big companies have resources, talent, and market reach, but compared to small start-ups with fewer resources and less talent and reach, they often seem slow and lumbering. Is this the inevitable result of becoming a large company? Many companies are slow, but in a different sense from the big-versus-small comparison. A large company's velocity has to be judged with "scale" in mind, and a large company can still move nimbly and quickly within that context, but the complexity of scale often overwhelms the organization, leaving it slow and lumbering.

Scale

To be fair to large companies, comparing a launch at a start-up with a launch at a large company isn't an apples-to-apples comparison because there are often a lot of additional requirements placed on a company that has reached a certain scale. Some of these requirements come externally (e.g. government regulations) and some internally (existing user base, supporting existing infrastructure and use cases).

It isn't lost on the employees how cumbersome internal requirements can be:

Having had to work on these systems, I know there are often valid reasons behind the requirements, but it can still be hair-pullingly frustrating to have to consider all the additional requirements for what seems like a simple, singular task. Objectively, though, the team within the large company might have a higher velocity than a team in a small company because, for a given feature A, one had 10 requirements to meet while the other only had 3.

This doesn't mean that big companies are not slower than start-ups. Many big companies lose their nimbleness through self-inflicted wounds brought on by the complexity of increased scale:

Complacency
Hesitancy
Bureaucracy: Navigating the corporate maze can feel like a full-time job (though, let's be real, some structure is necessary).

Complacency

The mantra of "If it ain't broke, don't fix it" can take hold at large companies, especially those that have had success. This often comes from the top, where the leadership is more comfortable keeping the status quo because things are going well and does not want to risk destabilizing the business. This isn't limited to mature companies; it can happen at growth companies too.

No matter if the decision to just stay the course is correct or not, if the sense of complacency from the top seeps into rank-and-file then the culture of complacency will take hold.

Hesitancy

Fear stifles innovation, and at large companies there can be a perception that more is at stake at both the company and individual levels. This hesitancy occurs at all levels of the organization. CEOs might fear how a change will impact the bottom line, and individuals might fear how it will affect their prospects at the company.

Bureaucracy

When a company grows (a good problem to have!), bureaucracy sets in, but bureaucracy itself isn't a bad thing. The truth is that when a company scales up it is going to become more complex and it will require organization. A small 3-person start-up doesn't need any process to communicate effectively, but a 100-member engineering org will descend into chaos without some agreed-upon method for working together. Large companies often organically evolve into a matrix organization with defined roles and responsibilities, but fail to recognize that they went from a single node to a graph of nodes. Company leaders continue to try to optimize each node while ignoring the edges connecting the nodes.

Big companies will have more people who can handle things at the node level, while it is mainly the leadership who can affect the edges. The irony is that leadership often focuses more on the nodes while their teams struggle to navigate the edges.

Possible Solutions

So, how do we combat the corporate slow-down? While this isn't a complete solution, here are some strategies:

Embrace Small, Nimble Teams: Have small and empowered teams that can move nimbly. Give the team autonomy to solve the problem, which helps against complacency and hesitancy. A small team also reduces the need for a lot of coordination process within the team. It is key for leaders to communicate expectations, be transparent, and show trust and support. Leaders should also focus on handling coordination between teams.

Provide a "Guide": Shield engineering teams from excessive corporate overhead. Dedicated project managers (not just program managers who push the burden onto engineers) can handle the administrative load. This is to address bureaucracy.

Focus on External Competition: Internal rivalry is a distraction. Aim for external benchmarks. This addresses complacency and hesitancy when there is an external opponent to focus on.

Leadership Support: Leadership should recognize that they created a matrix for the org and that the teams don't need help at the node level but with traversing the edges.

Maintain Momentum: Long projects can drain morale. Celebrate small wins and define clear milestones.

Instill Urgency: Set targets based on external market realities, ensuring team buy-in.

The key is to recognize that speed and agility aren't just startup perks. They're essential for any company that wants to thrive in today's fast-paced world. By addressing these challenges head-on, we can unlock the true potential of big companies and empower them to innovate at the speed of change.

Tuesday, March 25, 2025

Why the Model Context Protocol (MCP) Is Confusing

**Update 3/26/2025**

OpenAI added MCP support to their agent SDK.

--------------------------------

On Nov. 25, 2024, Anthropic published a blog post about the Model Context Protocol (MCP). It didn’t get much attention at the time because it wasn’t an announcement of much significance. Suddenly in 2025, MCP has gotten a lot of hype and attention and has also caused a lot of confusion as to why there are so many YouTube faces talking about it.

It was nice that Anthropic published how they connect Claude with tools in the Claude Desktop app, even if the post was a bit of marketing to sell it as a standard and to encourage an open community. There is a technical aspect (a protocol) to it, but it felt like a business play to get developers to extend Claude with plugins.

Large Language Models like Claude cannot perform any actions. They’re like a brain with no body. They might know that an email should be sent, but they can’t actually send the email. To connect the “thought” (send email) with the action requires basic programming. Using MCP is how Anthropic does it but it isn’t the only way. Let’s take a look at various ways that this is accomplished and then see how MCP fits in.

You Are the Agent

In this scenario, a person goes to claude.ai and has a conversation with Claude about writing an email to invite someone to lunch. Claude generates the email body and tells the person to copy it into an email program to send. The person manually copies that text into their email or calendar app and sends the invitation. The person is the agent because they are performing the action.

Using an AI Assistant

Here, a person uses an app (this can be a web app, desktop app, mobile app, etc.) such as a personal assistant à la Jarvis from Iron Man. The user asks Jarvis to send an email to invite someone for lunch. Jarvis composes the invitation message and sends the invitation through a calendar app so that it also records the event on the calendar. So how does Jarvis do this?

Method 1

Jarvis sends your question to LLM along with prompts describing available tool(s).
Jarvis looks at the response to determine what tools to use.
Jarvis executes the chosen tool(s) through the tool API
The results are sent back to LLM
LLM formulates a natural language response
The response is displayed to you!

In the early days (2023), this might be done like this:

The user speaks to Jarvis, “Jarvis, invite Thor to lunch next Wednesday.”

The Jarvis code passes the text to an LLM along with an additional prompt:

Respond to the user, but if the request is to add to a calendar then respond in JSON:

  {
    "Tool": "calendar",
    "Invitee": "<name>",
    "Date": "<date>",
    "Time": "<time>",
    "Body": "<message>"
  }

The Jarvis program will get the response and if the response is a tool response it will parse the JSON and call the calendar API.

The Jarvis code calls the LLM again with the text, “tell the user that the invite was successfully sent” and then returns the response to the user.
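The parsing step in Method 1 can be sketched in Go. This is only a minimal illustration, assuming the LLM reply is either plain text or a JSON object shaped like the prompt above; `parseToolCall` and `sendInvite` are hypothetical names, and `sendInvite` merely stands in for a real calendar API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ToolCall mirrors the JSON shape requested in the prompt.
type ToolCall struct {
	Tool    string `json:"Tool"`
	Invitee string `json:"Invitee"`
	Date    string `json:"Date"`
	Time    string `json:"Time"`
	Body    string `json:"Body"`
}

// parseToolCall reports whether the LLM reply is a tool request.
// Anything that is not a JSON object with a Tool field is treated
// as plain text meant for the user.
func parseToolCall(reply string) (ToolCall, bool) {
	var call ToolCall
	trimmed := strings.TrimSpace(reply)
	if !strings.HasPrefix(trimmed, "{") {
		return call, false
	}
	if err := json.Unmarshal([]byte(trimmed), &call); err != nil || call.Tool == "" {
		return ToolCall{}, false
	}
	return call, true
}

// sendInvite is a hypothetical stand-in for the real calendar API.
func sendInvite(c ToolCall) string {
	return fmt.Sprintf("invited %s on %s at %s", c.Invitee, c.Date, c.Time)
}

func main() {
	// Example reply as the LLM might produce it (made up for illustration).
	reply := `{"Tool":"calendar","Invitee":"Thor","Date":"next Wednesday","Time":"12:00","Body":"Lunch?"}`

	if call, ok := parseToolCall(reply); ok && call.Tool == "calendar" {
		fmt.Println(sendInvite(call))
		return
	}
	fmt.Println(reply) // plain text goes straight to the user
}
```

The fragile part is that the program has to guess whether the reply is a tool request or ordinary text, which is the gap the later methods close.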

Method 2

Developer registers available tools
Jarvis sends your question to LLM
LLM analyzes the available tools and decides which one(s) to use
Jarvis executes the chosen tool(s) through the tool API
The results are sent back to LLM
LLM formulates a natural language response
The response is displayed to you!

An enhancement was added to many LLM APIs to allow developers to register tools along with their purpose and parameters. The Jarvis code will register a tool called “calendar”, give it a description such as “Tool to add, update and remove user’s calendar.”, and declare the parameters it needs.

Now, when Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, it will respond with JSON and Jarvis can call the calendar API.

The Jarvis code calls the LLM again with the text, “tell the user that the invite was successfully sent” and then returns the response to the user.
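As a rough sketch of what such a registration might look like, the Go program below builds a tool definition and prints the JSON that would accompany a request. The `Tool` and `Param` types and their field names are assumptions made for illustration; each LLM provider defines its own schema, so this is not any specific vendor's API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Param describes one tool parameter (hypothetical shape).
type Param struct {
	Type        string `json:"type"`
	Description string `json:"description"`
}

// Tool is an illustrative tool definition: a name, a purpose,
// and the parameters the LLM is expected to fill in.
type Tool struct {
	Name        string           `json:"name"`
	Description string           `json:"description"`
	Parameters  map[string]Param `json:"parameters"`
	Required    []string         `json:"required"`
}

// calendarTool builds the "calendar" tool described in the text.
// The parameter names are assumptions.
func calendarTool() Tool {
	return Tool{
		Name:        "calendar",
		Description: "Tool to add, update and remove user's calendar.",
		Parameters: map[string]Param{
			"invitee": {Type: "string", Description: "Person to invite"},
			"date":    {Type: "string", Description: "Date of the event"},
			"time":    {Type: "string", Description: "Start time of the event"},
			"body":    {Type: "string", Description: "Invitation message"},
		},
		Required: []string{"invitee", "date"},
	}
}

func main() {
	// The tool list would be sent to the LLM alongside the user's question.
	out, err := json.MarshalIndent([]Tool{calendarTool()}, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```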

Method 3 (MCP)

User registers available tools.
Jarvis sends your question to Claude
Claude analyzes the available tools and decides which one(s) to use
Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API
The results are sent back to Claude
Claude formulates a natural language response
The response is displayed to you!

With MCP, the user (on desktop/mobile) or developer (on cloud) registers MCP servers with Jarvis. Jarvis can then get the tool descriptions from the MCP server, which it passes to the LLM.

When Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, the LLM will determine the tool to use.

Jarvis will then call the MCP server to send the calendar invite.

The Jarvis code calls the LLM again with the text, “tell the user that the invite was successfully sent” and then returns the response to the user.
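MCP messages are JSON-RPC 2.0, so the two requests Jarvis sends to an MCP server (asking for the tool list, then invoking a tool) can be sketched as below. This only builds and prints the request bodies; the transport and the initialization handshake are left out, and the `calendar` tool name and its arguments are assumptions carried over from the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// callParams carries the tool name and arguments for a tools/call request.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// listToolsRequest asks the MCP server which tools it offers.
func listToolsRequest(id int) rpcRequest {
	return rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"}
}

// callToolRequest asks the MCP server to run one tool.
func callToolRequest(id int, name string, args map[string]any) rpcRequest {
	return rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: name, Arguments: args},
	}
}

func main() {
	reqs := []rpcRequest{
		listToolsRequest(1),
		// Hypothetical tool and arguments chosen by the LLM.
		callToolRequest(2, "calendar", map[string]any{
			"invitee": "Thor",
			"date":    "next Wednesday",
		}),
	}
	for _, r := range reqs {
		b, err := json.Marshal(r)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```

Compared with Method 2, the tool description now comes from the server's answer to `tools/list` instead of being written into the Jarvis code.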

Comparison

With the MCP, tool registration is passed to the user and the tool description is handed off to the tool developer, but otherwise the steps remain the same.

Method 2	Method 3
Developer registers available tools	User registers available tools.
Jarvis sends your question to LLM	Jarvis sends your question to Claude
LLM analyzes the available tools and decides which one(s) to use	Claude analyzes the available tools and decides which one(s) to use
Jarvis executes the chosen tool(s) through the tool API	Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API
The results are sent back to LLM	The results are sent back to Claude
LLM formulates a natural language response	Claude formulates a natural language response
The response is displayed to you!	The response is displayed to you!

Comparing the different methods shows that the steps are the same and only the implementation is different. This is one reason there’s a lot of confusion: there seems to be very little benefit.

Having a standard protocol can be advantageous, but only when all LLM providers adopt it; otherwise it is just the way to interact with Claude.

MCP servers are potentially reusable and might ease integration which is a benefit since it’ll be like having only one API to learn. This requires wide adoption and availability which isn’t a given even if it is backed by one of the big LLM providers.

Shortcomings of the MCP

As a protocol, MCP has a lot of shortcomings and its technical benefits are minor.

Some technical shortcomings are:

There’s no discovery mechanism other than manual registration of MCP servers.
There are now extra MCP servers in the tech stack for something that could be achieved with a library.

Thus the main benefits will only come if the protocol is adopted as a standard.

Friday, March 21, 2025

Running Kokoro-82M Text-to-Speech on Fedora with Nvidia GPU with Podman

I got interested in Kokoro, a new text-to-speech model that uses only 82 million parameters but is one of the top models on the TTS leaderboard. I wanted to run it locally, and a quick way to try it is to use Kokoro-FastAPI, which comes in a Docker container. The README on Kokoro-FastAPI’s GitHub has instructions for Docker (with or without GPU), but I’m using Podman so I need to do some setup on Fedora to enable Podman to access the GPU.

The instructions on the Podman and Nvidia sites have you set up an Nvidia repository to get the container toolkit that enables Podman to access the GPU, but in my previous post on installing Nvidia and CUDA drivers on Fedora I mentioned that there can be dependency conflicts. I wasn't sure if the Nvidia container toolkit might also cause problems, but fortunately, you can just install the package from Fedora's repo and avoid possible headaches:

sudo dnf install golang-github-nvidia-container-toolkit

Assuming that you have already installed Podman (if not, follow the Fedora doc on installing Docker and/or Podman) and you have an Nvidia GPU, you can download and run:

docker run --device nvidia.com/gpu=all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2

If SELinux blocks Podman from accessing the GPU, you can follow Podman's instructions for giving containers permission to access the GPU:

sudo setsebool -P container_use_devices true

Then run the `docker run ...` command above and it should start.

This is a quick way to give Kokoro a try. Next, I'll probably try running it with Go by using the onnxruntime to load the Kokoro-Onnx model.

Tuesday, February 18, 2025

How I Avoid Doomscrolling/Doomsurfing

For those not familiar with the term doomscrolling, Wikipedia describes it as:

Doomscrolling or doomsurfing is the act of spending an excessive amount of time reading large quantities of news, particularly negative news, on the web and social media. (Wikipedia)

Doomscrolling has negative consequences for people's mental and physical health, such as increased stress, anxiety, depression, and isolation. Suggestions on how to break the habit and combat its negative effects include limiting screen time and seeking out more positive news. In our current environment there are numerous powerful forces working to keep people doomscrolling, such as corporations prioritizing engagement (keeping you hooked), publishers vying for your attention (often through negative news), and political leaders fueling fear.

Although I don't spend much time on social media, I do regularly read the news, follow current events and various feeds on topics I'm interested in. To avoid doomscrolling some people are able to stop following the news altogether, but I find that to be difficult to achieve for myself. Since publishers don't provide readers much control over what is shown, I built my own news aggregation site: news.lazyhacker.com.

Now, instead of seeing what publishers want me to see:

Or this set of headlines from feeds (which also illustrates how much political news is pushed onto us):

- Judge Chutkan rejects call from Democratic AGs for temporary restraining order blocking DOGE’s access to federal data - CNN
- Russia and US agree to work toward ending Ukraine war in a remarkable diplomatic shift - The Associated Press
- Pope Francis, still hospitalized, has pneumonia in both lungs - The Washington Post
- Fact Sheet: President Donald J. Trump Expands Access to In Vitro Fertilization (IVF) - The White House
- National Science Foundation fires roughly 10% of its workforce - NPR
- 'Executive order' cited as reason for sudden closure of JFK Library in Boston - WCVB Boston
- Ensuring Accountability for All Agencies - The White House
- Native American Activist Leonard Peltier Released From Prison - The New York Times
- Donald Trump signals Ukraine should hold elections as part of Russia peace deal - Financial Times
- Senate GOP pushes ahead with budget bill that funds Trump's mass deportations and border wall - The Associated Press
- Brazil Charges Bolsonaro With Attempting a Coup - The New York Times

I see a variety of headlines based on my own preferences:

- Pope Francis, still hospitalized, has pneumonia in both lungs - The Washington Post
- National Science Foundation fires roughly 10% of its workforce - NPR
- 'Executive order' cited as reason for sudden closure of JFK Library in Boston - WCVB Boston
- Rare deep-sea ‘doomsday fish’ washes up on Canary Islands coast - The Independent
- Hamas to release 6 more hostages, bodies of 4 others - ABC News
- Dramatic video shows moment Delta plane flipped after landing in Toronto - ABC News
- Futures Rise After S&P 500 Hits High; Two Earnings Losers Late - Investor's Business Daily
- Nvidia’s 50-series cards drop support for PhysX, impacting older games - Ars Technica
- AMD Ryzen AI Max+ 395 Analysis - Strix Halo to rival Apple M4 Pro/Max with 16 Zen 5 cores and iGPU on par with RTX 4070 Laptop - Notebookcheck.net
- Nintendo is killing its Gold Points loyalty program - Engadget
- iPhone 17 Air Leaks Look More Like Google Pixel - Forbes

The headlines can come from different feeds from places like Google News, Reddit, and any source that offers an RSS feed. The site takes the headlines from the feeds and runs them through a set of rules that I defined in natural language (e.g. "Remove political headlines, headlines about political figures or those who are not politicians but politically active.") to strip out any headlines that I might not want to see. I purposely don't show any images and only update the site every couple of hours. The former reduces the chance of me wanting to read an article because of the image rather than the substance, and the latter reduces my urge to constantly refresh because I know that there will be no new headlines for another 2 hours.
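The shape of that pipeline (pull titles out of a feed, then drop whatever the rules reject) can be sketched in Go. This is not the site's actual code: the real rules are written in natural language, whereas `blockWords` here is a simple keyword stand-in, and the `Rule` type, function names, and sample feed are all made up for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// rssFeed captures only the item titles of an RSS 2.0 document.
type rssFeed struct {
	Items []struct {
		Title string `xml:"title"`
	} `xml:"channel>item"`
}

// headlines extracts the item titles from an RSS document.
func headlines(doc string) ([]string, error) {
	var feed rssFeed
	if err := xml.Unmarshal([]byte(doc), &feed); err != nil {
		return nil, err
	}
	titles := make([]string, 0, len(feed.Items))
	for _, it := range feed.Items {
		titles = append(titles, strings.TrimSpace(it.Title))
	}
	return titles, nil
}

// Rule reports whether a headline should be dropped.
type Rule func(headline string) bool

// blockWords is a keyword-based stand-in for a natural-language rule.
func blockWords(words ...string) Rule {
	return func(h string) bool {
		lower := strings.ToLower(h)
		for _, w := range words {
			if strings.Contains(lower, strings.ToLower(w)) {
				return true
			}
		}
		return false
	}
}

// filter keeps the headlines that no rule rejects.
func filter(titles []string, rules ...Rule) []string {
	var kept []string
	for _, t := range titles {
		drop := false
		for _, r := range rules {
			if r(t) {
				drop = true
				break
			}
		}
		if !drop {
			kept = append(kept, t)
		}
	}
	return kept
}

// sample is a tiny hand-written feed used only for this example.
const sample = `<rss version="2.0"><channel>
<item><title>Senate pushes ahead with budget bill</title></item>
<item><title>Nintendo is killing its Gold Points loyalty program</title></item>
</channel></rss>`

func main() {
	titles, err := headlines(sample)
	if err != nil {
		panic(err)
	}
	for _, t := range filter(titles, blockWords("senate", "election")) {
		fmt.Println("-", t)
	}
}
```

Because a rule is just a function, a keyword matcher could be swapped for a call to a language model without changing the rest of the pipeline.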

Now, instead of finding myself being lured into doomscrolling, I can go to my site and see something like this:

Saturday, February 15, 2025

WebAssembly (WASM) with Go (Golang) Basic Example

I first wrote about using Go for WebAssembly (WASM) 6 years ago, right before the release of Go 1.11, which was the first release to support compiling to WASM. Go's initial support for WASM had many limitations (some I listed in my initial article) which have since been addressed, so I decided to revisit the topic with an updated example of using Go for WASM.

Being able to compile code to WASM now allows:

Go programs to run in the browser.
Go functions to be called by JavaScript in the browser.
Go code to call JavaScript functions through syscall/js.
Go code access to the DOM.

Setup

Go's official Wiki now has an article on the basics of using Go for WASM including how to set the compile target and setup.

A quick summary of the steps to the process:

Compile to WASM with an output file name ending in .wasm, since the MIME type mapping (e.g. in /etc/mime.types) is most likely keyed on the wasm extension.

> GOOS=js GOARCH=wasm go build -o <filename>.wasm

Copy the JavaScript support file to your working directory (wherever your HTTP server will serve it from). The wasm_exec.js must match the version of Go being used, so consider making this copy part of the build script.

> cp "$(go env GOROOT)/lib/wasm/wasm_exec.js" .

Then add the following to the html file to load the WASM binary:

<script src="wasm_exec.js"></script>
<script>
    const go = new Go();
    WebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {
                go.run(result.instance);
    });
</script>

Keeping It Running

It is a good starting point, but the Go code example is too simplistic. It only demonstrates that the WASM binary created by Go can be loaded by having it write a line to the browser's console. The Go program basically gets loaded by the browser, prints a line and exits. Most of the time, it's probably desirable to have the WASM binary get loaded and stay running. This can be achieved by having a channel that keeps waiting:

c := make(chan struct{})
<-c

or even easier:

select {}

Having either of these at the end of main() will keep the program alive after it gets loaded into the browser.

In Place of JavaScript

Being able to access the DOM excites me the most because it allows me to avoid writing JavaScript, followed by being able to run Go programs in the browser. While I think the inter-op between Go and JavaScript is probably the most practical application, it's not something I've had to do much since I'm not a front-end developer doing optimizations or trying to reuse Go code between the front-end and back-end.

I don't mind using HTML for UI development or even CSS; I'm just personally not a fan of JavaScript. This isn't to say that it is bad, just that I prefer other languages, just like some people prefer C++, Java, Python, etc. I don't have fun writing JavaScript like I do with Go, even though I know JavaScript.

Take a basic example of a web app (index.html) with a button to illustrate:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Example</title>
</head>
<body>

    <button id="myButton">Click Me</button>

</body>
</html>

JavaScript is used to attach an event to it so that when the button is clicked, an alert message pops up:

    // Select the button element
    const button = document.getElementById('myButton');

    // Attach an event listener to the button
    button.addEventListener('click', function() {
        alert('Button clicked!');
    });

With WASM, the JavaScript can be replaced with Go code:

package main

import (
	"honnef.co/go/js/dom/v2"
)

func main() {
	document := dom.GetWindow().Document()

	// Select the button element
	button := document.GetElementByID("myButton")

	// Attach an event listener to the button
	button.AddEventListener("click", false, func(dom.Event) {
		dom.GetWindow().Alert("Button clicked!")
	})

	// Block forever so the program stays alive to handle clicks
	select {}
}

In this case, the Go code looks extremely similar to the JavaScript code because I'm using the honnef.co/go/js/dom/v2 package. It is a Go binding to the JavaScript DOM APIs and I find that it makes it more convenient than using syscall/js directly.

Why do I prefer this over writing JavaScript, especially when they look so similar? The main reason is that most code is not just calling an API. There is other logic to implement, and for that I can use Go and Go's libraries along with the benefits of a compiled and type-safe language.

There are still things that need to be considered before using Go and WASM for a general consumer production web app. The binary size can be large, so it needs to be weighed against your audience. But if I'm doing a hobby project for myself, or a corporate app where I know the users have fast connections, or if app performance and functionality outweigh the initial download and memory usage, I'd try to use it.

Monday, February 10, 2025

The Death of Software Engineering Is Premature

There is a lot to be excited about when it comes to advancements in language models. The people who benefit the most are software engineers because these models can enhance the productivity of knowledgeable engineers.

While a simple prompt such as "build me a web app with a conversation interface," will result in a functional web app (this is great because building a basic commodity application is now accessible to everyone), these aren't the type of applications software engineers are normally tasked to write. Software engineers are tasked with building multi-layered and evolving software that can't be fully described with a few prompts. To build the software that businesses need requires skilled and knowledgeable people to direct the engineering work of the AI.

I built a simple app to filter news headlines. It is small enough to be digestible in a post, and I think it shows both how far a layman can go with building software using LLMs and why software engineers are still needed.

The First Prompt

Let's start with what many executives might think makes software engineers unnecessary:

Build me an app that will filter out political articles from Google News.

ChatGPT understood enough to generate an app using keywords to filter the headlines. One time it created a React app and a second time it used Python. Both times it used newsapi.org to get the news feeds and required me to understand how to build and run the app. The main issue is that the app isn't really what I wanted. It provides a news search whose results are then matched against keywords to decide what to filter out. I wanted the news that I normally see when I visit news.google.com minus the political articles, so I told ChatGPT precisely that:

I don't want to search for news. I want to see the articles that I see when visiting news.google.com minus political articles

The first time I asked ChatGPT, it understood this enough to switch to using the Google News RSS feed, which is excellent! The second time, it concluded that the way to do it is to scrape news.google.com. Both of these prompts highlight that some specialized knowledge is needed. Does the programming language matter? Should it use RSS or web scraping? How do you run the code it generates? Can these questions be ignored? Who in the organization does the CEO expect to be able to answer these questions?

The Second Prompt

While the CEO might not be able to use the AI, it is possible that a non-engineer or someone who doesn't know how to program can know enough to give more details in the prompt that improves on the first prompt. A technical product manager could give the AI the extra details:

1. single page web app using material design
   - have a header with the company logo on the left
   - have a settings menu on the right side of the header
   - ...
2. web app will get the list of headlines to display
3. pull headlines from RSS feeds and have an LLM return the items that are not political
4. build it with language X on the backend and basic JavaScript and CSS on the front end.

(4) was included to demonstrate how software often needs to fit in with a business' existing infrastructure. If the AI returns a React app but there is no infrastructure to build and deploy React apps then we'd be looking at additional costs and efforts to add that ability simply because the AI had no knowledge of what is reasonable for a specific business.

LLMs are capable of generating this app. The generated code didn't actually work and I had to guide the LLM on how to fix things, but for now let's assume that it generated a fully working app.

If we were to stop here then we might conclude that software engineers aren't needed anymore, but I believe the real conclusion is that companies that previously did not have engineers now have access to skills their organization never had before. For example, consider a small company whose office manager built a spreadsheet that the business runs on. The spreadsheet works but is limited. Extending it would previously have required some coding knowledge, but the office manager can now do it with the aid of an AI. There is a cost saving for the company because it doesn't have to hire a consultant (these companies probably don't have enough work to keep a full-time engineer on staff).

Remember that I left out a number of things that the AI cannot do, including turning the code into an app and deploying it, but the biggest hurdle would be if the AI did not meet all the requirements for the app initially. I had success prompting the AI to make certain changes and enhancements, but multiple times the AI would make a change that didn't work and go further and further down the rabbit hole. No prompting got it to fix the problem until I gave it very specific instructions on how to fix it ("In function X, change the max/min values to be Y for variable V").

Basically, maintaining, fixing and enhancing the app is where the challenge is and that's where most engineers are spending their time.

The Third Prompt

Prompt 3 (which in real life was actually what I did first) was the fastest and most productive way towards building a working application: have a design and give the AI many specific implementation instructions. I knew the architecture, algorithm and code structure and used that knowledge to guide the AI so that it was essentially a very fast typist of code:

Write a FetchRSS function that takes in a string value for the RSS feed configuration file path.
In FetchRSS, open up the configuration file and loop through each line to get the URL.
For each URL, fetch it and parse the RSS response into a slice of strings
Oh, ignore any lines that are blank or starts with '#'
...

Since I enjoy typing out code, I tended to write my own stuff, but for the "boring" code (e.g. handling when err != nil) I had the AI write it.

I was able to complete the code much faster, with fewer trips to look up references and documentation and fewer bugs from typos.

Here's the problem, though. While AI is capable of generating valid code, it isn't "working" code: it isn't code that can be used directly in a business' infrastructure. AI still struggles to understand an entire code base, and writing code that works within a business also requires understanding the environment the code runs in. If the code will run in complete isolation then it might work, but even a simple function such as "GetUserName" depends on where the user name is stored. What integrations must the AI be aware of in order to get the user name? In a real environment, the AI simply gives code snippets that the engineer must still adapt to the organization's infrastructure.

Conclusion

Having an AI capable of building software for your business is not realistic. My example app is too simplistic to be any company's product and the AI still wasn't able to build it without guidance.

Having a knowledgeable person on staff using AI will help your staff do their jobs better and allow companies to do things they previously couldn't, but most of these things are not things software engineers typically do. If companies try to have non-engineers do software engineering work with the AI, the likely result is decreased productivity. Any gains from having something done quickly at the beginning will be quickly overshadowed by the inability to maintain and enhance the software.

It is ultimately a decision companies have to make for themselves. Is a percentage reduction in what they spend on compensating engineers of greater value than multiplying the productivity and capabilities of those engineers (reducing cost vs. growth)?

What is complex today will not be as complex tomorrow. Today's complex tasks will become common and AI will be able to take care of them, but as AI takes over more of the common work, engineers will be free to tackle the next level of complexity with more originality, because businesses will need that to survive. There are leaders at technology companies boasting about being able to get rid of their software engineers or how they will stop hiring more engineers. These leaders are choosing to make cheap commodity products instead of pursuing growth and innovation, and they might find themselves racing to the bottom instead of accelerating to the top.

Can AI someday outpace people? Maybe. But it's not now. Declaring that engineers aren't needed is premature.

Using AI/LLM to Implement a News Headline Filter

There are news topics that I'd rather avoid being bombarded with, but options for filtering headlines on the news sites are generally very limited. You can tell Google News to block certain sources and there's a "fewer like this" option which doesn't seem to do anything, but neither will block topics (e.g. "politics", "reality TV shows", etc.), so you still end up getting bombarded with things you don't want to see even in your "personalized" views. Fortunately, sites like Google News provide RSS feeds that make it easier to get the list of headlines, links and descriptions without scraping websites, which can be brittle since the sites can change their layout at any time.

I decided to write my own filter to take out things that I'm not interested in and publish the result to a site that I can access from any browser. The simplest way (and one that doesn't even need a site to host the result) is to have a list of blocked words and drop any headlines with those words. As long as the app can access the RSS feed, even a mobile app can easily handle this kind of filtering, but creating and maintaining an effective block list becomes a challenge. A political news article isn't going to be titled "Political Article on City Council Votes On Artificial Turf at Parks". This is a good use case for using a language model to categorize a headline instead of using a keyword filter.
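To make the limitation concrete, here is a minimal sketch of the blocked-word approach. The function name filterBlocked, the sample block list and the first and third sample headlines are invented for illustration; matching is a simple case-insensitive substring test.

```go
package main

import (
	"fmt"
	"strings"
)

// filterBlocked returns the headlines that contain none of the blocked
// words. Matching is a case-insensitive substring test.
func filterBlocked(headlines, blocked []string) []string {
	var kept []string
	for _, h := range headlines {
		lower := strings.ToLower(h)
		drop := false
		for _, w := range blocked {
			if strings.Contains(lower, strings.ToLower(w)) {
				drop = true
				break
			}
		}
		if !drop {
			kept = append(kept, h)
		}
	}
	return kept
}

func main() {
	// Hypothetical sample data for the demonstration.
	headlines := []string{
		"Senate Election Results Are In",
		"City Council Votes On Artificial Turf at Parks",
		"New Telescope Captures Distant Galaxy",
	}
	blocked := []string{"election", "senate"}

	for _, h := range filterBlocked(headlines, blocked) {
		fmt.Println(h)
	}
}
```

With this block list the city council headline is kept, because nothing in its wording matches a blocked word even though the article is political. Adding "council" or "votes" to the list would catch it, but would also start dropping headlines that have nothing to do with politics.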

I created a prompt in natural language explaining what I don't want to see, along with some additional details to extend the common definitions, and then sent my instructions with the list of headlines to the language model and had it return a filtered list:

Remove all political headlines. Headlines that includes political figures, and celebrities who are active in politics such as Elon Musk should also be removed.

Being able to use natural language to handle categorization makes it so much easier to build the app. The categorization would've been the most challenging part of the project, but the LLM allowed me to get good results very quickly (it took just a few minutes to get the first results and some more time to tweak the prompt). Another benefit of large language models such as Gemini is that they understand multiple languages, so while I gave my instructions in English, the model can filter headlines in French, Chinese, Japanese, etc.

Using a language model does mean giving up some control, such as relying on what the model interprets "political" to mean. Prompts can help refine the model's interpretation, but sometimes a headline makes it through the filter and it is not as easy to determine why as it would be if I could read the algorithm and follow its reasoning.

I encountered a problem where the LLM's response stopped midway. This was because LLMs have limits on input and output tokens (how much text you can send in and how large a response it will send back). The input token limit is so high that I'm not likely to have enough headlines to even remotely approach it (Gemini's is 1 million tokens for the free tier and 2 million tokens for the paid tier, and I'll be using around 20,000). The output limit is much smaller (~8k), so the model can't send back the complete details (title and link) of the filtered headlines. To address this problem, I send the LLM the headlines with an index and have it return just the indexes. For 100 headlines the output is less than 200 tokens.

A cron job executes the program and writes the result as JSON to the server, where the web page loads the JSON and displays the headlines.

The code is on GitHub and you can see it is a very simple program.

Sunday, January 26, 2025

Upgrading from Fedora 40 to 41

Decided it's time to upgrade from Fedora 40 to 41 and it was a smooth upgrade. Will update if I run into any issues, but I've already been running 41 on a laptop for a couple of months so I know that 41 works for what I need. This time it was to upgrade systems that were already running an earlier version.
