**Update 3/26/2025**
OpenAI added MCP support to their agent SDK.
--------------------------------
On Nov. 25, 2024, Anthropic published a blog post about the Model Context Protocol (MCP). It didn't get much attention at the time because it wasn't an announcement of much significance. Suddenly in 2025, MCP has gotten a lot of hype and attention, and it has also caused a lot of confusion as to why there are so many YouTube talking heads discussing it.
It was nice that Anthropic published how they connect Claude with tools in the Claude Desktop app, even if the post was a bit of marketing to sell it as a standard and to encourage an open community. There is a technical aspect (a protocol) to it, but it felt like a business play to get developers to extend Claude with plugins.
Large Language Models like Claude cannot perform any actions. They're like a brain with no body. They might know that an email should be sent, but they can't actually send the email. Connecting the "thought" (send email) with the action requires basic programming. MCP is how Anthropic does it, but it isn't the only way. Let's take a look at various ways that this is accomplished and then see how MCP fits in.
You Are the Agent
In this scenario, a person goes to claude.ai and has a conversation with Claude about writing an email to invite someone to lunch. Claude generates the email body and tells the person to copy it into an email program to send. The person manually copies that text into their email or calendar app and sends the invitation. The person is the agent because they are performing the action.
Using an AI Assistant
Here, a person uses an app (this can be a web app, desktop app, mobile app, etc.) that acts as a personal assistant, à la Jarvis from Iron Man. The user asks Jarvis to send an email inviting someone to lunch. Jarvis composes the invitation message and sends it through a calendar app so that the event is also recorded on the calendar. So how does Jarvis do this?
Method 1
- Jarvis sends your question to the LLM along with prompts describing the available tool(s).
- Jarvis looks at the response to determine which tools to use.
- Jarvis executes the chosen tool(s) through the tool API.
- The results are sent back to the LLM.
- The LLM formulates a natural language response.
- The response is displayed to you!
In the early days (2023), this might be done like this:
The user speaks to Jarvis, “Jarvis, invite Thor to lunch next Wednesday.”
The Jarvis code passes the text to an LLM along with an additional prompt:
Respond to the user, but if the request is to add to a calendar then respond in JSON:

```json
{
  "tool": "calendar",
  "invitee": "name",
  "date": "date",
  "time": "time",
  "body": "message"
}
```
The Jarvis program gets the response; if it is a tool response, it parses the JSON and calls the calendar API.
The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
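The prompt-and-parse approach above can be sketched in a few lines of Python. This is an illustrative sketch, not Anthropic's actual code: `send_invite` is a hypothetical stand-in for a real calendar API, and the JSON shape is just the one from the example prompt.

```python
import json

# Prompt appended to the user's message (mirrors the example prompt above).
TOOL_PROMPT = (
    "Respond to the user, but if the request is to add to a calendar "
    'then respond only with JSON: {"tool": "calendar", "invitee": ..., '
    '"date": ..., "time": ..., "body": ...}'
)

def send_invite(invitee, date, time, body):
    """Hypothetical stand-in for a real calendar API call."""
    print(f"Calendar API: inviting {invitee} on {date} at {time}: {body}")

def handle_response(llm_text: str) -> str:
    """Decide whether the LLM replied with plain text or a tool request."""
    try:
        payload = json.loads(llm_text)
    except json.JSONDecodeError:
        return llm_text  # ordinary conversational reply, pass it through
    if isinstance(payload, dict) and payload.get("tool") == "calendar":
        send_invite(payload["invitee"], payload["date"],
                    payload["time"], payload["body"])
        return "Invite sent."
    return llm_text
```

The fragility here is obvious: the LLM must emit exactly the JSON the prompt asked for, and every tool means more hand-written prompt text and parsing code.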
Method 2
- Developer registers the available tools.
- Jarvis sends your question to the LLM.
- The LLM analyzes the available tools and decides which one(s) to use.
- Jarvis executes the chosen tool(s) through the tool API.
- The results are sent back to the LLM.
- The LLM formulates a natural language response.
- The response is displayed to you!
An enhancement was added to many LLMs' APIs to allow developers to register tools along with their purpose and parameters. The Jarvis code registers a tool called "calendar", gives it a description such as "Tool to add, update and remove events on the user's calendar.", and declares the parameters it needs.
Now, when Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, it will respond with JSON and Jarvis can call the calendar API.
The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
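A registration in this style typically looks like the sketch below. The exact field names vary by provider, so treat this shape, and the `dispatch` helper, as illustrative assumptions rather than any particular vendor's API.

```python
import json

# A tool registration in the JSON-schema style most LLM tool-calling APIs use.
# Field names are illustrative; each provider's API differs slightly.
TOOLS = [{
    "name": "calendar",
    "description": "Tool to add, update and remove events on the user's calendar.",
    "parameters": {
        "type": "object",
        "properties": {
            "invitee": {"type": "string"},
            "date": {"type": "string"},
            "time": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["invitee", "date"],
    },
}]

def dispatch(tool_call: dict) -> str:
    """Route a structured tool call returned by the LLM to the real API."""
    if tool_call["name"] == "calendar":
        # Many APIs return the arguments as a JSON-encoded string.
        args = json.loads(tool_call["arguments"])
        return f"Invited {args['invitee']} on {args['date']}"
    raise ValueError(f"Unknown tool: {tool_call['name']}")
```

The difference from Method 1 is that the LLM provider now guarantees a structured tool-call response, so Jarvis no longer has to coax JSON out of the model with a hand-written prompt.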
Method 3 (MCP)
- User registers the available tools.
- Jarvis sends your question to Claude.
- Claude analyzes the available tools and decides which one(s) to use.
- Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API.
- The results are sent back to Claude.
- Claude formulates a natural language response.
- The response is displayed to you!
With MCP, the user (on desktop/mobile) or developer (in the cloud) registers MCP servers with Jarvis. Jarvis can then get the tool descriptions from the MCP server, which it passes to the LLM.
When Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, the LLM will determine the tool to use.
Jarvis will then call the MCP server to send the calendar invite.
The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
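In Claude Desktop, for example, registering an MCP server is a JSON entry in the app's config file; the app launches the server and asks it for its tool descriptions. The server name, command, and path below are hypothetical:

```json
{
  "mcpServers": {
    "calendar": {
      "command": "node",
      "args": ["/path/to/calendar-mcp-server/index.js"]
    }
  }
}
```

Note what moved: the tool's description and parameter schema now live inside the MCP server, written by the tool developer, instead of in Jarvis's own code.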
Comparison
With MCP, tool registration is pushed to the user and the tool description is handed off to the tool developer, but otherwise the steps remain the same.
Comparing the different methods shows that the steps are the same; only the implementation differs. This is one reason for the confusion: there seems to be very little benefit.
Having a standard protocol can be advantageous, but only if all LLM providers adopt it; otherwise it is just another way to interact with Claude.
MCP servers are potentially reusable and might ease integration, which is a benefit since it would be like having only one API to learn. But this requires wide adoption and availability, which isn't a given even with the backing of one of the big LLM providers.
Shortcomings of the MCP
As a protocol, it has a lot of shortcomings, and the technical benefits are minor.
Some technical shortcomings are:
- There’s no discovery mechanism other than manual registration of MCP servers.
- MCP adds extra server processes to the tech stack for functionality that could be achieved with a library.