Automating Internal Links – SEO

by Florius

As the creator of my website, I receive a surprising number of emails from people offering to SEO-optimize it for a price. So over the last few months, I started looking at what is actually wrong with my website. One of the issues was internal links, where you link one article to another on your own website. I’ve done this on many occasions when writing on similar topics at roughly the same time. But if two articles are a year or two apart, I normally don’t go back to update the old one, although I do try to incorporate it into the article I am currently writing.

In this article I first explain briefly why internal links can improve your search engine results. After that I cover what was probably one of my larger issues, namely broken links. Going through every page and checking every link by hand is too tedious, so I’ll go into detail on how I tackled this problem. In the last section, I look at improving my internal links. For this part I used machine learning to compare articles and find out which ones should be linked together.

Are Internal Links Good for SEO?

The short answer: Yes. The long answer is still yes and for several reasons. Internal links help search engine bots, like Googlebot, crawl through your site. If a page isn’t linked from anywhere, it might never be discovered or indexed. They also help define your site’s structure, making it clear which pages are most important and how content is related. This organization helps both search engines and visitors.

Internal links also improve user experience. They help readers find related content naturally, keeping them engaged longer and lowering bounce rates, both of which are positive SEO signals. Plus, when one page performs well, internal links can pass some of that authority (called link equity) to other pages, helping them rank better too.

On the flip side, broken links, meaning links that point to non-existent or deleted pages (404 errors), can harm your SEO. Search engines may read them as a sign of poor maintenance and rank your site lower because of it. That’s why regularly checking and updating internal links is just as important as creating them.

Broken Links and How I Fixed Them with Python

Broken links can seriously affect your website’s SEO. We have all encountered those 404 “Page Not Found” errors before, and most of us probably left the site to find another one that had the information we were looking for.

In my case, the issue started when I changed how my URLs were structured. They now follow the format https://www.florisera.com/name_of_my_article/. This change caused many of my older posts to contain broken links. Manually finding and fixing them would take far too much time, so I decided to create a Python script that automatically detects and records them for me. The full code can be found on my GitHub.

A flowchart shows “Sitemap → Crawler → Load CSV.” From “Next page?” a No branch goes to “Done.” The Yes branch goes to “Single page HTML → Check Links → 404?”. If 404 is Yes, the link is added to “Broken links” and the flow returns to “Next page?”. If 404 is No, it also loops back to “Next page?”, repeating until finished.
Figure 1. Broken-link checker workflow: crawl sitemap, paginate through URLs from CSV, fetch each page, check links, log 404s, repeat until no next page, then finish.

In Figure 1, you can see the process flow I designed. It begins with my website’s sitemap, which contains a list of all article URLs. These URLs are stored in a single CSV file that acts as a small database where I can extract or add data. Using the Python library BeautifulSoup, the script parses the HTML of each article and extracts all internal links (stored as <a href="">).
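The first two steps of Figure 1 could look roughly like the sketch below. It is an illustration, not the script from the GitHub repository: to stay dependency-free it uses the standard library’s xml.etree and html.parser instead of BeautifulSoup, and the function names sitemap_urls and internal_links are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> entry found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


class _AnchorCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_links(html, page_url):
    """Return absolute, de-duplicated links that stay on the same host as page_url."""
    site = urlparse(page_url).netloc
    parser = _AnchorCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" parts.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == site and absolute not in links:
            links.append(absolute)
    return links
```

Calling internal_links(html, page_url) for every URL returned by sitemap_urls gives the per-article link list that the next step verifies. Comparing hosts exactly means that www and non-www variants count as different sites in this sketch.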

At this point, I have a complete list of internal links, and I just need to verify if each one still works. For this, I use a HEAD request instead of a GET request, since I only care about whether the page exists and not about downloading its content. If no broken links are found, the script moves on to the next article. If it does find any, it logs them in the CSV file and continues until all articles have been checked. Once every page is processed, the script finishes.
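The HEAD-based check described above might be sketched as follows, using only the standard library’s urllib. The names head_status and find_broken are hypothetical, and the status_of parameter is an assumption added here so the checker can be tried with a fake lookup instead of real network calls.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10.0):
    """Send a HEAD request and return the HTTP status code (0 if unreachable)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx responses as exceptions; the code is on the error.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar network errors.
        return 0


def find_broken(links, status_of=head_status):
    """Return (link, status) pairs for every link that answers with 404."""
    broken = []
    for link in links:
        status = status_of(link)
        if status == 404:
            broken.append((link, status))
    return broken
```

One caveat with this approach: some servers do not answer HEAD requests properly (for example with 405 Method Not Allowed), so a fallback to GET for those cases may be worth adding.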

At this stage, I found no easy way to automatically replace the broken links with working ones. Luckily there were only 20 broken links across my entire website, so fixing them by hand was still manageable.

Recommending Internal Links

The other thing I looked into was how to improve internal linking. I usually have an intuitive sense of which articles relate to each other, and with enough time I could manually link them all. However, since I was already using Python scripts, automating this process would be far more efficient and scalable. 

Just like in my earlier web-crawling step, I reused my website’s sitemap containing all article URLs and scraped each page’s metadata using BeautifulSoup. For every article, I extracted the following fields:

  • Title (<title> or og:title)
  • Meta description (<meta name="description">)
  • Keywords (from the “tag cloud” section)
  • Excerpt (parsed from the Elementor JS data, when available)

After gathering all this information, I combined the title, excerpt, meta description, and keywords into a single text block per article. Computers, unlike humans, don’t actually “understand” text. Instead, they represent it as vectors in a high-dimensional space. Large Language Models (like ChatGPT) and smaller embedding models (like SentenceTransformers) convert text into these vector embeddings.
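Building the per-article text block could look something like the sketch below. It only covers the two fields that live in standard HTML tags (the title and the meta description) and again uses the standard library’s html.parser rather than BeautifulSoup. The keywords and excerpt are passed in as arguments because extracting them depends on the theme’s markup, and build_text_block is a hypothetical name.

```python
from html.parser import HTMLParser


class _MetaCollector(HTMLParser):
    """Collects the <title> text and the content of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # <meta name="description"> and <meta property="og:title"> both end up here.
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content"):
                self.meta[key.lower()] = attrs["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_text_block(html, keywords=(), excerpt=""):
    """Join title, excerpt, meta description, and keywords into one text block."""
    parser = _MetaCollector()
    parser.feed(html)
    # Fall back to og:title when the page has no <title> tag.
    title = parser.title.strip() or parser.meta.get("og:title", "")
    description = parser.meta.get("description", "")
    parts = [title, excerpt, description, ", ".join(keywords)]
    return "\n".join(part for part in parts if part)
```

Empty fields are skipped, so an article without an excerpt or keywords still produces a usable block for the embedding step.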

Semantic Cosine Similarity

A three-column table titled “source_url, target_url, similarity_score” with heat-colored scores (green=high, red=low). Rows show suggestions like /3-phase-ac-systems/ linking to articles on AC motors, brushed vs. brushless motors, IGBT inverters, energy transfer in DC/AC, MOSFET vs. IGBT, capacitors, PWM in PIC16F877A, and IEEE reference style, with scores from ~0.22–0.56. Another block shows /dissertation-research-results-tips-and-example/ linking to dissertation pages (conclusion, methodology, discussion, introduction, overview, appendix, literature review, preface) with higher scores ~0.61–0.86.
Figure 2. Internal-link recommendations ranked by semantic similarity; greener cells indicate higher scores between source and target URLs.

In my case, I used the all-MiniLM-L6-v2 model from SentenceTransformers to generate embeddings for each article’s text block. I then calculated the cosine similarity between every pair of articles to measure how semantically close they are to each other. For each article, I selected the top 8 most similar posts (excluding itself, of course) and stored these results in a separate data file, as shown in Figure 2. I color-coded the scores: green indicates a high similarity between the source URL and the target URL, while red indicates a very low similarity. In my example, the 3-phase AC system article has only a few closely related articles, whereas for the “dissertation research results” article almost all the other dissertation chapters are a strong match, which is to be expected.
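The ranking step can be illustrated with a small pure-Python sketch. Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it is 1 for vectors pointing the same way and 0 for perpendicular ones. The embeddings are assumed to have been computed already, the toy two-dimensional vectors in the usage note stand in for real ones, and the name top_matches is hypothetical.

```python
import math

# Assumed upstream step (needs the sentence-transformers package, not run here):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("all-MiniLM-L6-v2")
#   vectors = model.encode(text_blocks)


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(embeddings, k=8):
    """Map each URL to its k most similar other URLs.

    embeddings: dict of url -> vector.
    Returns rows of (source_url, target_url, similarity_score).
    """
    rows = []
    for source, vector in embeddings.items():
        scored = [
            (target, cosine_similarity(vector, other))
            for target, other in embeddings.items()
            if target != source  # never recommend an article to itself
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        rows.extend((source, target, round(score, 4)) for target, score in scored[:k])
    return rows
```

For example, top_matches({"/a/": [1.0, 0.0], "/b/": [0.0, 1.0], "/c/": [1.0, 1.0]}, k=1) pairs both /a/ and /b/ with /c/. Comparing every pair in a plain loop is fine for a few hundred articles, and a matrix-based version would be the natural choice for much larger sites.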

Visualization of Link Network

From this database of similarity scores, I wanted to visualize my internal link recommendations as a network graph. In Python, such a graph is typically made with the NetworkX and PyVis modules. The graph I created is shown in Figure 3. Each node is an article, and arrows are drawn between similar articles. I filtered out weak links, keeping only pairs with a similarity ≥ 0.3.

A large node-link graph showing many small blue nodes connected by curved blue edges. Three colored outlines highlight communities: a blue cluster at top-right (electronics topics), an orange cluster at left (research/writing posts), and a green cluster at bottom-right (spintronics). Dense areas indicate strongly interlinked posts; a few tiny subgraphs sit around the periphery with fewer connections.
Figure 3. Internal-link network of my site, clustered into three themes: Electronics (blue), Research & Writing (orange), and Spintronics (green). Nodes are posts; lines are suggested links.
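The threshold filter and the grouping into clusters can be approximated without any graph library. The sketch below is a stand-in for the NetworkX and PyVis approach, not a reproduction of it: it treats suggested links as undirected, keeps only those at or above the 0.3 cut-off, and then collects connected groups of articles. The function names and the sample rows in the usage note are invented for illustration.

```python
from collections import defaultdict


def build_graph(rows, threshold=0.3):
    """Keep links with score >= threshold and return an undirected adjacency dict."""
    graph = defaultdict(set)
    for source, target, score in rows:
        if score >= threshold:
            graph[source].add(target)
            graph[target].add(source)
    return dict(graph)


def clusters(graph):
    """Return the connected groups of articles as a list of sets."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over everything reachable from start
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups
```

With rows such as ("/a/", "/b/", 0.8), ("/b/", "/c/", 0.5), ("/c/", "/d/", 0.1), and ("/d/", "/e/", 0.4), the weak 0.1 link is dropped and two groups remain: /a/, /b/, /c/ and /d/, /e/. Articles with no link above the threshold do not appear at all, and raising the threshold splits the graph into more and smaller groups.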

From the graph, I highlighted three main topics that correspond perfectly with the sections featured in my website’s header, which is of course no coincidence. Each topic can be further divided into two smaller sub-groups like this:

  • Blue: Electronics
    • Tutorials on PIC16F877A
    • Motors, AC vs DC, IGBTs
  • Orange: Research and Writing
    • Citation and References
    • Dissertation
  • Green: Spintronics and Physics
    • CMOS, Scaling, SC Technology
    • Ferromagnetism, Majorana, Physics
  • Free-floating:
    • Social Media 
    • Resistor Color (Change)
    • Tips and motivational
    • HSV paint

And finally, a few less categorizable topics like Links, CV, and History pages.

If you are interested, I’ve uploaded the results here, but they might look slightly different from Figure 3, depending on the parameters you set.

Conclusion

When I started this article, I initially thought it might be possible to automatically insert links into my older posts. However, I decided to let that idea go, since I wouldn’t really know how to program something that finds the right spot in a text and adds a link in a natural way (perhaps an LLM like ChatGPT can do it, but then it needs access to my website). Instead, I tried doing it manually, with the dataset at hand. I managed to go through a few articles, but it turned out to be a rather tedious process, so I stopped about halfway. Still, it’s a useful technique to keep applying in future posts, especially while I’m already writing the article and the context is still fresh.
