RANDOM BITS

A random site by a random clueless human
Random bits of programming, math, and thoughts By a clueless human

Complete List


The Sign of Char in ARM

January 15, 2025

micro   arm   C

The value stored by the following line is ambiguous:

char i = -1;

The issue with the above line is that the value of i is not immediately obvious, as compilers for different architectures may treat a plain char as either signed or unsigned. The signedness of a data type can be thought of simply as whether or not there is a dedicated sign bit that indicates whether the number is positive or negative.

In the “Signedness of Chars” section (Chapter 19 - Portability) of Robert Love’s book Linux Kernel Development, he notes that some systems, such as ARM, treat char as unsigned by default, which goes against the intuition of us AMD64 (x86-64) programmers. Effectively, the value of i will be stored as 255 rather than -1. The reason for this is apparently performance.

Let’s verify this on my Raspberry Pi 4 machine running Linux:

char i = -1;
if (i == 255) {
    printf("char is unsigned\n");
}
if (i == -1) {
    printf("char is signed\n");
}

Result: char is unsigned

Therefore, to make your code portable, explicitly declare the variable as signed char or unsigned char instead of assuming a default whenever its value may lie outside of 0 to 127.

I heard rumours that Apple treats char as signed on its ARM machines, but I unfortunately do not have access to an Apple ARM development machine to verify this myself.

Maybe I’ll dig into this more on my blog by examining what is going on under the hood, comparing it with amd64, and checking whether or not QNX makes this change as well.

Note: I purposely omitted how the sign bit and two’s complement work.


A First Glance at Raspberry Pi 4 Running QNX 8.0

January 4, 2025

micro   qnx   rpi

I recently purchased a Raspberry Pi 4 to fiddle around with QNX 8.0, which has been released to the educational and hobbyist community. Being frustrated with running QNX virtual machines via Momentics, I decided it would be much easier to run QNX on a piece of hardware. I do not know if it is just me, but I find my newly created virtual machines have a one-time boot lifespan. Whenever I attempt to restart my virtual machines via Momentics, I run into network-related issues and cannot connect to the virtual machine remotely. This means that I cannot transfer files. I have long forgotten how to create QNX images and run them on QEMU via the command line, so I took the opportunity to buy myself a Raspberry Pi (tbh I was just being too lazy to resolve my issues).

Anyhow, enough idle chatter. Upon booting QNX on the Raspberry Pi, you’ll be first greeted with what appears to be demolauncher (though I did not confirm this):

QNX Welcome Screen on Raspberry Pi

Honestly this screen is annoying because I have to connect a mouse to simply open a terminal. Once authenticated, the terminal will greet you with a message along with a few bundled samples you can try:

Welcome to QNX 8.0!

This QNX 8 target is now ready for your QNX applications. There are
some sample applications included here to help you get started with
development. Find the source code for most of them at gitlab.com/qnx.

Here's how to run some of the bundled samples:

|------------------|----------------------------------------------------------|
| $ gles2-gears    | Displays hardware-rendered content using OpenGL ES 2.x.  |
| $ gles2-maze     | Shows how to use texture, vertex, and fragment shaders.  |
| $ vkcubepp       | Demonstrates 3D rendering capabilities using Vulkan.     |
| $ camera_example | Displays a simulated camera signal or live camera feed.  |
| $ st             | The default terminal. Run it to open a new instance.     |
| $ blinq          | A simple web browser. Make sure your system date is set. |
|------------------|----------------------------------------------------------|

(You can use ALT-TAB to switch between windows.)

Have questions? Find the community on Reddit at r/QNX, ask on StackOverflow,
or log an issue at https://gitlab.com/qnx .

Below are some of the bundled samples mentioned above, along with some additional samples that I found on the image.

The sample renderings are for illustration purposes only. To post them on the internet at a reasonable size, I had to significantly reduce the quality of the recordings:

3 gears rotating and spinning first person view of going through a maze

gles2-gears or gles3-gears

gles2-maze

a camera view going around a red teapot a cube spinning

gles2-teapot

vkcubepp

a simulation of SMPTE color bars when the TV has no signal

camera_example: brings back memories of SMPTE color bars from the old days

As one can observe, most of these samples demonstrate graphical rendering using either OpenGL ES 2 (and 3) or Vulkan. I do think it is quite a neat demo, but it does not demonstrate QNX’s potential and abilities. It is hard to demonstrate QNX’s abilities without going into the weeds and pushing the OS. I have not fiddled around with QNX much, so to be honest, these graphical demos surprised me, as I haven’t seen QNX having anything graphical aside from the QNX 1.44MB Demo disk and QNX 6.5. I am more used to the plain old terminal experience I had with QNX 7 and 7.1.

st is a useful command to spawn a new terminal without going back to the awful welcome screen and manually pressing the button to spawn a new terminal. As much as I dislike the welcome demo screen, the page demonstrates to hobbyists that the QNX Raspberry Pi image has graphical capabilities that they can utilize for their own side projects. The blinq browser, as stated in the description, is indeed a simple web browser.

There is nothing impressive about the web browser: as advertised, it is a bare-bones web browser that can be quite slow. Weirdly enough, the web browser is not able to immediately report that a webpage does not exist. I do not know if that is an issue with the browser itself or the default network settings, because it would load a non-existent page forever (well, to be more precise, I gave up waiting).

A look into Blinq Web browser

Unsurprisingly, blinq is a Chromium based browser that is severely out of date.

Blinq

There are a good number of useful binaries bundled in the Raspberry Pi image by default:

a listing of some of the binaries available on the system

Two of the features I greatly appreciate are the screenshot and vncserv utilities, which take screenshots and give remote access to the machine with a graphical interface respectively. Hopefully we can see some interesting projects from the hobbyist community in the near future.


New Laptop: Framework 16

December 29, 2024

Ever since Linus Tech Tips (LTT) introduced Framework, a repairable and modular laptop, back in 2021, I have always wanted one for myself. I have loved the idea of modular electronics ever since PhoneBloks introduced their idea of modular phones. Modular electronics are usually highly repairable because one can easily swap a faulty component for a new one instead of going to a repair shop or dumping the device into the garbage. Bringing the desktop experience of upgrading parts such as the CPU, RAM and storage to the laptop was very appealing. Electronics of the past were much easier to repair and upgrade, but these days laptops are designed in ways that make upgrades difficult, such as with soldered RAM. Laptops are also not as repairable as they once were because more components are integrated into the SoC, which allows manufacturers to design significantly more compact and sleeker devices. An SoC has more benefits than just compactness: it can also help with power efficiency and speed, as it can be optimized for fast access between the CPU and memory. While there could be engineering reasons to solder RAM, it likely also encourages consumers to purchase a new laptop instead of upgrading.

An image of the Framework laptop

A Framework laptop and its various parts. Source: Framework

The Framework laptop is great, but every criticism you have heard about the Framework laptop holds true. Cost is the biggest issue with Framework laptops. As Framework is a small company, it cannot build at scale like the other OEMs. You will be paying an extremely hefty price to obtain a modular laptop. You could get a laptop from other OEMs with better specs for way less than what Framework offers. The laptop is not suitable for regular consumers and is way more expensive than a luxury laptop (aka MacBooks). There are other issues with the Framework laptop, but I consider these to be the cost of modularity rather than the fault of Framework. As I mentioned earlier, there are tradeoffs between modularity and integrating everything into an SoC. When you are getting a Framework laptop, you are buying the laptop for its modularity and repairability. For instance, when you buy a Framework 16, you can see the outlines of the various sliders around the keyboard and touchpad. In addition, you can clearly see the outlines of each expansion card on the laptop.

On a very positive note, you can swap the expansion cards to fit your needs, and for those who care about colors, you can easily swap the screen bezel for one in another color. You can also change the panels surrounding the keyboard, such as adding a numpad, swapping the keyboard for an RGB keyboard, or getting an LED matrix panel. The flexibility to change the expansion cards was the biggest appeal of the laptop for me, as you get to choose which IO ports will be HDMI, USB-A, or USB-C (with some restrictions).

I should keep this more brief as this is a microblog … Anyhow, now that I have access to my first dedicated GPU, I can play video games that aren’t Minesweeper, Solitaire, Starcraft (Broodwar) and PC ports of old games like Final Fantasy 7. Ever since players were forced to move onto Counterstrike 2, I was no longer able to play Counterstrike with my old Lenovo Gen 7 X1 Carbon laptop. I was surprised by how noisy the laptop can be when playing Counterstrike 2, though that is likely due to my inexperience playing video games that require a dedicated GPU (and I am playing on a laptop, which is probably not the best idea if you want to play video games). Here are the specs:

$ neofetch
             .',;::::;,'.                zaku@fedora 
         .';:cccccccccccc:;,.            ----------- 
      .;cccccccccccccccccccccc;.         OS: Fedora Linux 40 (Workstation Edition) x86_64 
    .:cccccccccccccccccccccccccc:.       Host: Laptop 16 (AMD Ryzen 7040 Series) AJ 
  .;ccccccccccccc;.:dddl:.;ccccccc;.     Kernel: 6.11.4-201.fc40.x86_64 
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.    Uptime: 5 hours, 46 mins 
.:ccccccccccccc;KMMc;cc;xMMc:ccccccc:.   Packages: 2254 (rpm), 12 (flatpak) 
,cccccccccccccc;MMM.;cc;;WW::cccccccc,   Shell: bash 5.2.26 
:cccccccccccccc;MMM.;cccccccccccccccc:   Resolution: 1920x1080 
:ccccccc;oxOOOo;MMM0OOk.;cccccccccccc:   DE: GNOME 46.6 
cccccc:0MMKxdd:;MMMkddc.;cccccccccccc;   WM: Mutter 
ccccc:XM0';cccc;MMM.;cccccccccccccccc'   WM Theme: Adwaita 
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;    Theme: Adwaita [GTK2/3] 
ccccc;0MNc.ccc.xMMd:ccccccccccccccc;     Icons: Adwaita [GTK2/3] 
cccccc;dNMWXXXWM0::cccccccccccccc:,      Terminal: gnome-terminal 
cccccccc;.:odl:.;cccccccccccccc:,.       CPU: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics (16) @ 5.263GHz 
:cccccccccccccccccccccccccccc:'.         GPU: AMD ATI c4:00.0 Phoenix1 
.:cccccccccccccccccccccc:;,..            GPU: AMD ATI Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600 
  '::cccccccccccccc::;,.                 Memory: 7192MiB / 31386MiB 

On OpenBlender Benchmark:

monster: 130.805407
junkshop: 85.742239
classroom: 64.374681

Which is significantly better than what my X1 Carbon achieved (where higher numbers are better).


Utilizing Aliases and Interactive Mode to Force Users to Think Twice Before Deleting Files

December 29, 2024

I previously mentioned that I lost a file by accidentally overwriting it using the cp command. This got me thinking about why this would be impossible on my work laptop, where I am constantly bombarded with a prompt to confirm my intention to overwrite the file.

$ cp 2024-12-01-template.md 2024-12-30-alias-interactive.md
cp: overwrite '2024-12-30-alias-interactive.md'?

Commands like mv and cp have an interactive flag -i to prompt before overwriting a file, as seen in man 1 cp:

-i, --interactive
              prompt before overwrite (overrides a previous -n option)

To force everyone at work to have this flag enabled, they made cp and mv an alias in our default shell configs:

alias cp="cp -i"
alias mv="mv -i"

Which you can also verify using the type command:

$ type cp
cp is aliased to `cp -i'
$ type mv
mv is aliased to `mv -i'

Stack Overflow: The Case of a Small Stack

December 20, 2024

micro   stack   qnx   C

Years ago I was once asked by an intern to debug a mysterious crash that seemed so innocent. While I no longer recall what the code was about, we stripped the program to a single line in main. Yet the program still continued to crash.

Source:

int main() {
    char buf[1024*1024*1024];
}

Result:

# ./prog-arm64 

Process 630803 (prog-arm64) terminated SIGSEGV code=1 fltno=11 ip=00000025333267f0 mapaddr=00000000000007f0 ref=000000443dd5dc50
Memory fault (core dumped) 

This bewildered all of the interns as it made absolutely no sense. Through our investigation, there were two things we noticed:

  1. The program worked on our local machines but not on our target virtual machine
  2. We were allocating an extremely large buffer in the stack which was unusual

It turns out the intern had wanted to allocate a 1 MiB buffer for some networking or driver related ticket, not the 1 GiB shown above. If I recall correctly, our target only had 512 MB of RAM, so this could have explained the mysterious crash. But even a 1 MiB buffer on the stack was too large for our target:

Source:

int main() {
    char buf[1024*1024];
}

Result:

# ./prog-arm64 

Process 696339 (prog-arm64) terminated SIGSEGV code=1 fltno=11 ip=0000004de7e7a7ec mapaddr=00000000000007ec ref=000000383b19fbe0
Memory fault (core dumped) 

One thing I purposely omitted was that our target was running QNX, a realtime operating system. If we take a look at the documentation:

A process’s main thread starts with an automatically allocated 512 KB stack – QNX SDP 8.0 - Stack Allocation

This shocked all of us since 1 MiB was not a large buffer in 2021, when we had plenty of memory on our personal systems at home. The buffer alone was twice the size of the entire 512 KB stack.

Note 1: The target used in the example was an aarch64le. This example will work on amd64 (x86_64) but requires you to add something such as a print statement.

Note 2: QNX 8.0 was released to the general public in late 2023 or early 2024, so the actual target at the time when the question was asked was running either QNX 7.0 or QNX 7.1 (I do not recall which version).

The behavior on AMD64 (x86_64), as noted, requires more fiddling to trigger a crash, which came as a surprise to me. A slightly more detailed version will be released shortly on my blog, which will include a very brief explanation of why AMD64 doesn’t crash unless something extra is added, like a call to puts.


Jekyll Cache Saving the Day

December 17, 2024

I was in the midst of publishing a post announcing that QNX released a non-commercial license which allows hobbyists to fiddle around, but I accidentally overwrote my file using the cp command. This effectively killed my mood as I did not want to rewrite everything from scratch. I then recalled that Jekyll creates a cache to speed up the build process when converting markdown to HTML.

$ ls -ld .?* 
drwxr-xr-x. 1 zaku zaku 204 Dec 16 23:47 .git
-rw-r--r--. 1 zaku zaku   0 Oct 20 19:55 .gitignore
drwxr-xr-x. 1 zaku zaku  32 Oct 20 19:56 .jekyll-cache

If we traverse into the cache and into Jekyll--Converters--Markdown, we see a lot of directories labelled with what appear to be hex values:

.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown$ ls
0e  1c  22  24  2e  37  3f  44  47  53  57  5d  62  66  6e  74  7b  84  8d  90  91  9c  a7  a9  aa  ab  b1  b3  b6  c1  c6  cb  d4  d5  e1  e2  ea  f9  fc

Using my trusty tool grep, I was able to patch together pieces of my work. However, as the purpose of Jekyll--Converters--Markdown is to cache markdown files that have been converted to HTML, I obviously had to clean it up a bit. Regardless, it was much faster than rewriting the entire article.


QNX is 'Free' to Use

November 9, 2024

micro   qnx

Recently on Hackernews, a developer relations person from QNX announced that QNX is now free for anything non-commercial. QNX also made an announcement to the LinkedIn community, which was where I learned about it. For those who are not familiar with QNX, QNX is a proprietary realtime operating system targeted at embedded systems and is installed in over 255 million vehicles. QNX has a great reputation for being a reliable and safe embedded platform to build software on top of, due to its microkernel architecture and compliance with many industrial and engineering design processes, which gives customers the ability to certify their software in safety critical systems more easily. What makes QNX appealing is a discussion for another time, but for me, this is a good opportunity to fiddle around with the system. I was previously denied a license by my university, which had an agreement with QNX, and my attempts to get an educational license years ago did not go far.

LinkedIn Post announcing QNX 8.0 has a non-commercial license

Previously, to gain access to QNX, one would have to either purchase a commercial license from QNX or have an academic license. This kept hobbyists from having access to the operating system. With the non-commercial license, QNX is now open to those who are interested in running an RTOS in their hobby projects and to open source developers who want to port their software to QNX. QNX is POSIX compliant, but as QNX was not open for public use, companies had to port open source projects such as ROS (Robot Operating System, which isn’t an actual OS) to QNX themselves. QNX also mentions that the non-commercial license allows one to develop training materials and books on utilizing QNX, which are frankly scarce outside of QNX authorized materials (i.e. QNX training, Foundry27, and QNX Documentation).

A sample of what is allowed with a non-commercial license

While the announcement is welcome news for me, as I would love to tinker around, this is yet another product entering the hobbyist community late. The reason for the success of UNIX, Linux, RISC-V, and ARM is the ease and availability of the product to hobbyists and students, who later bring it to their workplace or make the product better. Closing access to technology is a recipe for disaster in the long term when it comes to gaining a market advantage. This is exactly the reason why we see cloud corporations enticing the student and hobbyist populations with free (limited) access to their products and even at times sponsoring events targeted towards them. Being open source makes Linux, BSD, and FreeRTOS the dominant OSes among the tinkering community, and they have wide adoption in the market. Over the years, we have seen a shift from customers using commercial and custom grade hardware and software towards more open source or off the shelf solutions, including in safety critical applications such as SpaceX using Linux and non radiation hardened CPUs. IBM, for instance, has been late to developing an ecosystem of developers for their Cloud, Database and Power Architecture. Over recent years IBM has done a good job of creating free developer focused trainings which make use of their own technologies. However, it is plainly obvious that IBM has failed to capture the mainstream interest of hobbyists, who much prefer other cloud providers such as AWS, Google Cloud, Linode, and Digital Ocean. The SPARC and POWER architectures were open-sourced far too late by their respective owners, so developers have shifted towards RISC-V and ARM, as those architectures are either more open or easier to obtain (such as through the Raspberry Pi Foundation).

While I have not done any sentiment analysis of this announcement, I think overall this move is a good first step towards developing an ecosystem of developers who appreciate and understand the QNX architecture, but it is also met with skepticism. For reference, QNX has messed with the community twice before, which explains the big mistrust from experienced developers. The top comment on Hackernews does a great job summarizing the skepticism. QNX used to have a bigger hobbyist community in the past, when open source projects such as Firefox would have a build for QNX, but that all died when QNX closed their doors to the community. Years later, the QNX source code was made available for the public to read (though probably with restrictions), but that availability was shut down after QNX was acquired by Blackberry, who does not have the best reputation in the developer community (hence why Blackberry phones failed to capture the market, from my understanding, despite once being a market leader).

Regardless, I have plans to create a few materials on QNX in the coming months and perhaps create a follow-up to QNX Adaptive Partitioning System, as it has been ranked in the top 5 of Google search results (though I doubt it had many readers given the small population of QNX developers):

Google Search Result Ranking for my QNX APS webpage

Google Search Console from July 9 2023 - Nov 8 2024 which had 308 clicks


[Preview] Manually Verifying an Email Signature

October 8, 2024

micro   gpg   signing

I noticed that the neocities community loves using protonmail and some even share their public key to enable fully encrypted communication. While I care about cyber security more than the average human, I do not care enough to start requiring others to encrypt their emails and sign their messages so that I can verify the authenticity of the messages I receive.

Out of curiosity, I decided to see how one would manually verify the signature of an email to ensure that the email has not been tampered with and comes from the person it claims to be from. I won’t go into how digital signatures work, as those details will be posted shortly afterwards on my blog.

  1. Import Alice’s public key:
    $ gpg --import publickey-alice@proton.me.asc 
    gpg: key <redacted>: public key "alice@proton.me <alice@proton.me>" imported
    gpg: Total number processed: 1
    gpg:               imported: 1
    
  2. Download the email .eml file and the signature
    $ ls signature.asc 'GPG Signing test.eml'
    'GPG Signing test.eml'   signature.asc
    
  3. Extract the message to verify from .eml file

    This is where things get difficult. The downloaded email *.eml has a lot of unneeded information that needs to be discarded. I highly recommend that you make a copy of the email file because the process does take a while to get used to.

    The content of the message starts after you see the following header (the hash will differ):

     This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
     --------7005887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69bc70a
    

    So for instance, let’s look at the following file:

     MIME-Version: 1.0
     Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha512; boundary="------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415"; charset=utf-8
    
     This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
     --------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415
     Content-Type: multipart/mixed;boundary=---------------------ff35159c3ebf11234dd954191b3141592
    

    Then the first line of the signed message is:

     Content-Type: multipart/mixed;boundary=---------------------ff35159c3ebf11234dd954191b3141592
    

    Where the signed message ends is a source of confusion. On the internet, many say to put everything between the first boundary and the second boundary into a new file. The boundary they are referring to is the line after This is an OpenPGP/MIME signed message (RFC 4880 and 3156), which has the form ----<hash>.

     --------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415
    
     //email content
    
     --------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415
    

    Despite my many attempts, I had no success until I realized you have to delete all trailing newlines. One thing I noticed is that the hash on the first line of the signed message also appears on the last line of the signed message.

     Content-Type: multipart/mixed;boundary=---------------------ff35159c3ebf11234dd954191b3141592
    

    The first line of the signed file

    The hash on the first line of the signed message is: ff35159c3ebf11234dd954191b3141592 so our file should also end with this hash.

    If our message looks something like this:

     MIME-Version: 1.0
     Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha512; boundary="------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415"; charset=utf-8
    
     This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
     --------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415
     Content-Type: multipart/mixed;boundary=---------------------ff35159c3ebf11234dd954191b3141592
    
     ...
    
     -----------------------ff35159c3ebf11234dd954191b3141592
     Content-Type: application/pgp-keys; filename="publickey - alice@proton.me - <redacted>.asc"; name="publickey-alice@proton.me.asc"
     Content-Transfer-Encoding: base64
     Content-Disposition: attachment; filename="publickey-alice@proton.me.asc"; name="publickey - alice@proton.me - <redacted>.asc"
    
     ABCDEF0x4ZjZkeGxSL0xUABCDEFmltotlUR0ABCDEFWaABCDEFE9PQP9ABCDEFAABCDEFtLUVORCBABCED
     ABCDEFEABCDEFFWSBCTE9DSy0tLABCDE==
     -----------------------ff35159c3ebf11234dd954191b3141592--
    
     --------3141887d7abcdefgbe09e18825fd164103abcdefgf8c40b59382649cd69b31415
    

    Then the signed message should be

     Content-Type: multipart/mixed;boundary=---------------------ff35159c3ebf11234dd954191b3141592
    
     ...
    
     -----------------------ff35159c3ebf11234dd954191b3141592
    
     ...
    
     -----------------------ff35159c3ebf11234dd954191b3141592
     Content-Type: application/pgp-keys; filename="publickey - alice@proton.me - <redacted>.asc"; name="publickey-alice@proton.me.asc"
     Content-Transfer-Encoding: base64
     Content-Disposition: attachment; filename="publickey-alice@proton.me.asc"; name="publickey - alice@proton.me - <redacted>.asc"
    
     ABCDEF0x4ZjZkeGxSL0xUABCDEFmltotlUR0ABCDEFWaABCDEFE9PQP9ABCDEFAABCDEFtLUVORCBABCED
     ABCDEFEABCDEFFWSBCTE9DSy0tLABCDE==
     -----------------------ff35159c3ebf11234dd954191b3141592--
    
  4. Verify the signature: gpg --verify signature.asc message.txt

     $ gpg --verify signature.asc message.txt 
     gpg: Signature made Mon 07 Oct 2024 11:29:48 PM EDT
     gpg:                using EDDSA key <redacted>
     gpg: Good signature from "alice@proton.me <alice@proton.me>" [unknown]
     gpg: WARNING: This key is not certified with a trusted signature!
     gpg:          There is no indication that the signature belongs to the owner.
     Primary key fingerprint: <redacted>
    

In practice, no one verifies the digital signatures of emails manually. Any sane individual will use an email client that automates the verification process for them. This was a quick preview of a blog post I will be writing in the next few days that will go into email signatures in more detail, with better explanations and diagrams.


[Preview] Half-Width and Full-Width Characters

October 6, 2024

Those of us who live in and speak English will probably never think about how characters are encoded, that is, how characters such as the very letters you see on the screen are represented by a number, like 65 for ‘A’ in ASCII, which takes 1 byte and fits in a char in C.

I was not aware of the existence of full-width and half-width characters until a friend asked me to briefly explain, at a high level, the difference in how the characters are represented. For those like me who weren’t aware that the Japanese mix zenkaku (full-width) and hankaku (half-width) characters, look at the image below or visit this webpage: https://mailmate.jp/blog/half-width-full-width-hankaku-zenkaku-explained

An image displaying the difference between full and half-width characters

Based on the article I shared, half-width characters take up 1 byte while full-width characters take up 2 bytes (they can also be called double-byte characters). I do believe this depends on the encoding used. For me, the most obvious distinction between half-width and full-width characters is how much graphical space they consume, as is evident from the images above and below:

Full and Half Width Characters encoded on UTF-8

Full and Half Width encoded on UTF-8 as seen through Vim

While I have read and typed Korean during my younger years when I was forced to learn Korean, it never clicked for me how much space Korean takes up graphically. It is obvious in hindsight, but it was nonetheless interesting. Taking a look at the sizes in bytes, we can see that the number 1 in UTF-8 encoding takes 1 and 3 bytes for the half-width and full-width character respectively:

$ stat -c "%n,%s" -- halfwidth-utf8.txt fullwidth-utf8.txt 
halfwidth-utf8.txt,1
fullwidth-utf8.txt,3

One confusion I had was understanding the difference between UTF-8 and UTF-16, and the following exercise helped me understand this:

  • UTF-8 encodes each character in 1 to 4 bytes
  • UTF-16 encodes each character in either 2 or 4 bytes

UTF-8 and UTF-16, as you can tell, are variable length, meaning they take up more or fewer bytes depending on the character being encoded. We can see this by comparing the Arabic numeral 1 with its Chinese counterpart:

$ stat -c "%n,%s" -- halfwidth-1.txt chinese-1.md 
halfwidth-1.txt,1
chinese-1.md,3

In UTF-8, 1 takes up 1 byte, which is unsurprising as ASCII characters have a great advantage in UTF-8 compared to characters from Asian languages.

Note: Do not attempt to display UTF-16 encoded files on the terminal without changing your locale (or whatever it is called). It will not display nicely. Vim on my machine will automatically open the file as UTF-16LE.

My default terminal settings are unable to display the content in Chinese properly

Let’s inspect the contents of the files holding the half-width character 1 and the full-width character 1 in hex:

$ cat halfwidth-1.txt; echo ""; xxd halfwidth-1.txt; cat fullwidth-1.txt ; echo ""; xxd fullwidth-1.txt 
1
00000000: 31                                       1
１
00000000: efbc 91                                  ...

As we can see, the half-width character 1 in UTF-8 is represented as 0x31, meaning only one byte is required. However, the full-width digit is represented as 0xEFBC91, which takes three bytes. Now let’s compare this with UTF-16:

$ cat halfwidth-utf16.txt; echo ; xxd halfwidth-utf16.txt; cat fullwidth-utf16.txt; echo; xxd fullwidth-utf16.txt 
1
00000000: 0031                                     .1
�
00000000: ff11                                     ..

Note: To view UTF-16 in Vim, run the following in command mode (i.e. press esc to exit the current mode and press : to enter command mode): e ++enc=utf-16be fullwidth-utf16.txt

As expected, UTF-16 represents code points in the upper range very well: we now see １ (full-width 1) being represented with only 2 bytes, unlike the 3 that were required in UTF-8. The same cannot be said for code points in the lower range, such as our half-width digit 1, which now takes 2 bytes because its hex representation is padded with a 0x00 byte.

I will be writing a more detailed look into encoding at my blog in the coming days. This is just a quick preview.


Mixing Number and String

September 18, 2024

A recent post has gotten somewhat popular on the web and is something many of us could somewhat relate to. In the case of many, including the author, the issue stems from how YAML treats strings and numbers. As a rule of thumb, I would always suggest avoiding any potential confusion by always adding quotes around a string to ensure the value is treated as a string as intended. The crux of the post was how their Git commit inconveniently happened to be 556474e378, which is very rare to obtain. Recall that scientific notation is in the form of \d+(\.\d+)?E-?\d+ (with either E or e), such as 8.5E-10 to refer to 8.5 x 10^-10.

The issue that one may encounter when mixing numbers and strings is that things can go very unexpectedly, as they did for the author, whereby 556474e378 was treated as 556474 x 10^378. While I do not have any specific examples in mind, I have definitely encountered this issue before where I mixed up a string and a number and obtained an undesired behavior. However, I do not think I have ever encountered an issue where my numbers were interpreted as scientific notation.
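To see why an unquoted 556474e378 is ambiguous, here is a small sketch in C using strtod from the standard library. This is only an illustration of the ambiguity, not how any particular YAML parser is implemented, and parses_as_number is a name I made up for this example.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the ENTIRE string is accepted by
 * strtod as a floating-point number, 0 otherwise. On success the parsed
 * value is stored in *value (if value is not NULL). */
int parses_as_number(const char *s, double *value) {
    char *end;
    double v;

    if (*s == '\0') return 0;       /* empty string is not a number */
    v = strtod(s, &end);
    if (*end != '\0') return 0;     /* leftover characters: treat as a string */
    if (value) *value = v;
    return 1;
}
```

With this helper, "556474e378" is accepted as a number even though it was meant to be a commit hash, while a hash such as "556474e37g" is rejected because of the trailing g. As a side note, 556474 x 10^378 is larger than the biggest double (around 1.8 x 10^308), so strtod reports an overflow for it (it returns HUGE_VAL and sets errno to ERANGE) rather than a usable value.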


`.` At The End of a URL

August 30, 2024

micro   dns   network

I recently learned that website addresses can be terminated with a . such as www.google.com. or https://neocities.org.. However, this does not work for all websites. I was skimming through Network for Dummies during work and while it doesn’t cover anything useful for the work I am trying to do (if you have taken a network course before, don’t bother reading this book unless you are bored like I was [1]), terminating a website address with a . was a surprise.

The book states that "If a domain name ends with a trailing dot, ..., and the domain name is said to be a fully qualified domain name (FQDN)". The difference between an absolute name (FQDN) and a relative name is important when working with DNS and can cause an “internet outage” if done incorrectly, as one user on Hacker News comments. Based on some article (linked by a Stack Overflow user), websites that fail to handle a trailing . in their domain names are the ones in violation of RFC 1738, or at least not heeding its recommendations.
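As a rough illustration of the absolute v.s. relative distinction, here is a minimal sketch in C. Both is_fqdn and qualify are hypothetical helpers written for this post, and the single search domain is an assumption; real resolvers follow more involved rules (search lists, ndots and so on) that are not modelled here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: a name is absolute (an FQDN) if it ends with '.'. */
int is_fqdn(const char *name) {
    size_t n = strlen(name);
    return n > 0 && name[n - 1] == '.';
}

/* Hypothetical helper: absolute names are copied as-is, relative names get
 * the search domain appended. Returns 0 on success, -1 if out is too small. */
int qualify(const char *name, const char *search_domain,
            char *out, size_t outsz) {
    int written;

    if (is_fqdn(name))
        written = snprintf(out, outsz, "%s", name);
    else
        written = snprintf(out, outsz, "%s.%s.", name, search_domain);

    return (written < 0 || (size_t)written >= outsz) ? -1 : 0;
}
```

Under these assumptions, qualify("www.google.com.", "example.com", ...) leaves the name untouched, whereas qualify("www", "example.com", ...) produces www.example.com., which hints at how a forgotten trailing dot can send a lookup somewhere unintended.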

Notes:

[1] While Network for Dummies was surprisingly fun to read due to the author’s writing style, it lacks technical depth, which should come as no surprise.


Splitting Pdfs into Even and Odd Pages

August 28, 2024

During the winter break I obtained an old Xerox XE88 Workstation Printer released in the year 2000, the year when the media were worried about Y2K wreaking havoc on our digital infrastructure, which thankfully never came at the scale we all feared. Though of course a bug will eventually creep in and wreak havoc (i.e. the Crowdstrike Falcon update). But I digress, using this printer was filled with frustration as it is a relic from the past that is not meant to be used in 2024.

Firstly, the printer requires a parallel port, which no modern computer comes equipped with. I had to drag out my last surviving desktop from my childhood that originally came with Windows Me, which we immediately switched to the glorious Windows XP that we all know, love and dearly miss. As it turns out, a few months after my first use of the printer, my PS/2 connected mouse stopped working. I do not know if the PS/2 port is broken or if my PS/2 mouse is broken. I did manage to find another PS/2 mouse but as it was water damaged from a basement leak a few years ago, there was little chance it would work. Not having a mouse made this task much harder, but I digress.

Parallel Port and connector

Parallel Port

PS/2 Port typically found in desktops from the 90s

Placing aside the hardware struggles of operating such a printer in 2024, the printer does not have duplex printing so I had to run commands on my PDFs on my Linux machine before transferring the files to my Windows XP machine (thankfully there are USB ports on this desktop that work or else I would have to dust off my 3.5 inch floppy disks and CDs). Splitting PDFs into even and odd pages turns out to be a very simple command:

pdftk A="${file}" cat Aodd output "${file}-odd.pdf"
pdftk A="${file}" cat Aeven output "${file}-even.pdf"

As I am printing a bunch of papers on Trusted Computing, I needed to split a lot of PDFs, which can get quite tedious, so I wrote a simple shell script:

for file in *.pdf; do
  # strip the .pdf suffix so the outputs are named e.g. paper-odd.pdf
  base="${file%.pdf}"
  pdftk A="${file}" cat Aodd output "${base}-odd.pdf"
  pdftk A="${file}" cat Aeven output "${base}-even.pdf"
done

Executing Script Loophole

August 28, 2024

I recently came across an article discussing an attempt to close a loophole that bypasses the normal execute permission bit. Exploiting a program’s suid and euid to gain higher privileges is a commonly known technique called privilege escalation. The article does not cover this, but it introduces a flaw in the current way Linux handles the execution of scripts. I do not know why privilege escalation came to my mind but as I usually write nonsensical things anyway, I shall keep it here for now. The article gives a neat example where a script does not have the execute bit set but is still executable by invoking the script via an interpreter.

$ ls -l evil-script.py 
-rw-r--r--. 1 zaku zaku 86 Aug 28 00:20 evil-script.py
$ ./evil-script.py
bash: ./evil-script.py: Permission denied
$ python3 evil-script.py 
Evil script has been invoked. Terror shall fill this land

As you can see, the script has no execute bit set. However, the script is still executable by feeding it to the interpreter, which only needs to read the file. I had never considered this a security loophole but after reading the article, I realized there are some concerns with allowing scripts to be executed in a way that bypasses the file’s permissions. I have always made a habit of leaving many of my interpreted scripts non-executable and feeding them to the interpreter due to laziness (I know it’s a one time thing to set the execute bit but I am just too lazy to run chmod).
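For illustration only, here is a minimal sketch in C of the kind of check an interpreter could do by itself before running a script: look at the file’s mode bits with stat. This is my own assumption of what such a check might look like, not the mechanism proposed in the article, and has_execute_bit is a hypothetical name.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if any execute bit (user, group or other)
 * is set on the file, 0 if none is set, and -1 if stat fails. */
int has_execute_bit(const char *path) {
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;                                   /* e.g. file does not exist */
    return (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
}
```

For evil-script.py above, with mode -rw-r--r--, this would return 0. A check like this in userspace is easy to get wrong (it ignores who is asking, ACLs, noexec mounts and so on), which is presumably part of why the article looks at support from the kernel instead.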

The article covers some promising approaches so I do expect a solution to be merged into the kernel sometime in the near future, which will force me to change my habits once the interpreters make the change. Though if interpreters do adopt this patch, I expect quite a few production and CI/CD servers to be impacted as there will always be people like me who are too lazy to set the execute bit on their scripts.

One benefit of closing this loophole is to force users to deliberately make the conscious choice to set the execute bit similar to how we have to set the flatpaks we download as executables (at least from my personal experience) before we can execute the flatpaks.


Replacing main()

August 24, 2024

micro   gcc   c

Any beginner C programmer will know that the first function executed in any program is the main() function. However, that is not the entire truth. Just like the Bohr and Lewis diagrams we learned in high school Chemistry, this is an oversimplification. To my knowledge, the first function executed in a binary once the loader has run is _start().

Without going into any details, we can replace main() with another function such as foo() (sorry for the lack of creativity).

#include <stdio.h>
#include <stdlib.h>

int foo() {
  printf("Called foo\n");
  exit(0);
}

int main() {
  printf("Called main\n");
  return 0;
}

If we compile with -e <entry> where <entry> is the name of the function replacing main(), we can see the following results:

$ gcc foo.c -e foo
$ ./a.out 
Called foo

We can also use objdump and nm to see where the start address of the C code is (here I am making a distinction between the first entry point of the C code and the binary).

$  objdump -f ./a.out | grep start
start address 0x0000000000401136
$ nm ./a.out | grep foo
0000000000401136 T foo

Few Notes

  1. You must define main() even if it’s not going to be used. CPP Reference states this explicitly:

    Every C program coded to run in a hosted execution environment contains the definition (not the prototype) of a function named main, which is the designated start of the program.

    Neglecting to define main results in an error like the following:

    $ gcc foo.c
    /usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/14/../../../../lib64/crt1.o: in function `_start':
    (.text+0x1b): undefined reference to `main'
    collect2: error: ld returned 1 exit status
    
  2. The C program entry must call exit() to terminate if it is not main() or else a segfault will occur
    $ ./a.out 
    Called foo
    Segmentation fault (core dumped)
    

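    A backtrace via gdb won’t give much information as to why. Probably best to consult glibc. Essentially, it is likely because the entry point is not called like a regular function: there is no return address on the stack for it to return to, so returning from foo() jumps to garbage (note that frame #0 below is 0x1, which looks suspiciously like argc). The normal _start never returns either; it calls exit to terminate the program, which probably does some cleanup via atexit and sets the exit status $? to some value.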

    (gdb) bt 
    #0  0x0000000000000001 in ?? ()
    #1  0x00007fffffffdd46 in ?? ()
    #2  0x0000000000000000 in ?? ()
    
  • https://vishalchovatiya.com/posts/crt-run-time-before-starting-main/
  • https://www.gnu.org/software/hurd/glibc/startup.html
  • https://stackoverflow.com/questions/63543127/return-values-in-main-vs-start

Editing GIFS and Creating 88x31 Buttons

August 18, 2024

micro   gifs   gimp

Lately I have been learning how to edit GIFs and it is painstakingly difficult to remove a background from a GIF without using an AI tool, especially when the image has over 70 frames. There is likely an easier way to edit GIFs but I had to manually edit over 50 frames, erasing the clouds from a GIF using the eraser tool frame by frame, which took some time to finish.

Original:

Result:

However, if you are not editing a GIF but rather trying to incorporate the GIF into your 88x31 buttons, it turns out to be quite simple. Following the instructions from a video on YouTube, I managed to create a few simple 88x31 buttons that feature cats, coffee, and the two programs I am studying or have finished studying (i.e. doing a 2nd degree):

To resize the GIFs, I used the ezgif resize tool to set the height to 31px since I didn’t know how to resize GIFs in GIMP, as it would open a bunch of layers. I guess I could have used ffmpeg but using an online tool is just more convenient and easier. I do wonder if there are other standard anti-pixel button sizes aside from 80x15 pixels because a height of 31 pixels is quite limiting. It’s amazing what the community has been able to do with such a limited number of pixels.

For instance, the Bash button I have made has the subtitle “THE BOURNE-AGAIN SHELL” which is very hard to make out. I am assuming the standard practice is to render the button as a GIF and display the text on the next frame. That way users would be able to see the full-text.


multiple definition of `variable` ... first defined here

August 10, 2024

micro   gcc   c

Randomly I decided to compile some old projects I worked on and I was surprised to see a few compilation errors in an assembler I wrote years back. As it has been many years since I last touched most of the projects I looked at, I was pleased to see the compiler catching obvious mistakes I had made in the past. Though it did come as a surprise that the compiler I used years ago never complained about such obvious mistakes. The solution and reason for the last compilation error were not immediately obvious to me:

$ make
gcc -o assembler assembler.c symbol_table.c parser.c  -fsanitize=address -lasan
/usr/bin/ld: /tmp/cc1MoBol.o:(.bss+0x0): multiple definition of `table'; /tmp/cc0B4XxW.o:(.bss+0x0): first defined here
/usr/bin/ld: /tmp/cc1MoBol.o:(.bss+0x81): multiple definition of `__odr_asan.table'; /tmp/cc0B4XxW.o:(.bss+0x40): first defined here

At first I thought I may have made a stupid mistake and defined the struct called table twice, but all I could find was symbol_table.h, the file that declared the variable, being included by assembler.c and parser.c. This led to the conclusion that there must have been a compiler behavioral change between GCC 9 and GCC 14. After a quick googling and going through the Release Notes, it turns out that starting from GCC 10, GCC now defaults to -fno-common:

GCC now defaults to -fno-common. As a result, global variable accesses are more efficient on various targets. In C, global variables with multiple tentative definitions now result in linker errors. With -fcommon such definitions are silently merged during linking.

In the Porting to GCC 10 webpage, the developers of GCC note:

A common mistake in C is omitting extern when declaring a global variable in a header file. If the header is included by several files it results in multiple definitions of the same variable

To resolve this issue, one can either silently ignore the mistake and compile with -fcommon, or correctly declare the global variable with the extern keyword in the header and define it in exactly one .c file.
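A minimal sketch of the extern fix is shown below, squashed into a single listing so it compiles on its own. The comments mark which file each part would live in; the contents of struct symbol_table and the table_count function are hypothetical stand-ins, not the actual code from my assembler.

```c
/* --- symbol_table.h (hypothetical contents): declaration only --- */
struct symbol_table {
    int count;                     /* placeholder field for illustration */
};
extern struct symbol_table table;  /* declares table, allocates nothing */

/* --- symbol_table.c: the one and only definition --- */
struct symbol_table table = { 0 };

/* --- parser.c / assembler.c: users only see the extern declaration --- */
int table_count(void) {
    return table.count;
}
```

With the extern in the header, every file that includes symbol_table.h refers to the same table defined in symbol_table.c, so the linker no longer sees multiple definitions, with or without -fcommon.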


Delusional Dream of a OpenPower Framework Laptop

August 4, 2024

Framework is a company that makes modular and repairable laptops and has captured the interest of tech enthusiasts over the past 4 years. Currently Framework laptops are limited to the x86-64 architecture, supporting Intel and, since 2023, AMD CPUs. Although Framework laptops are not entirely open source, from my understanding they have open sourced a decent chunk of their work, which allows third party development of components and makes partnerships possible for other companies such as DeepComputing to release a mainboard that runs a RISC-V CPU. While the new mainboard will not be usable for everyday applications, it is a step forward to a more open ecosystem and this is an exciting step for Framework, RISC-V and the broader open-advocate community.

This announcement makes me wonder about the possibility of OpenPower running on a Framework laptop. Similarly to RISC-V, there isn’t an easily accessible way to obtain a consumer product running on OpenPower (aside from Raptor Computing with their extremely expensive machines). There is the PowerPC Notebook project run by a group of volunteers who are trying to get an open source PowerPC notebook into the hands of hobbyists. It would be interesting if the OpenPower community could also partner with Framework to develop a mainboard once the project is complete and the software is more mature.

However, this would be a difficult step as there is no dedicated company like DeepComputing that will pour resources into making this happen. The interest in OpenPower is low and overshadowed by the wider industry interest in expanding the ARM and RISC-V architectures to consumers. IBM made a huge mistake in open sourcing the POWER architecture too late. But one could always dream (even if it’s delusional) :D


2024 Update

August 4, 2024

micro   site

Website

In the past year I have been very lazy, as is evident from my lack of activity on my personal blog. I'm now trying to pick up blogging again. It's hard to believe that it's been almost an entire year since I created this Neocities site, which has had almost 0 updates since. I've been thinking about how to use this site since I already have a blog on GitHub Pages. Honestly, I forgot this corner existed, and it’s been bothering me that I couldn’t write my random, nonsensical thoughts because my main blog wouldn’t be a suitable medium until now. So, I’ve decided that this corner will be a microblog where I can share random articles and thoughts. A microblog is different from a regular blog in that the content is much shorter. This space will allow me to quickly jot down something random. I hope that a collection of these random posts will evolve into a blog post or spark an idea for my final year thesis or project.

How are my studies going?

I’m still studying Mathematics, but I’ve lost much of my initial interest in the field after taking a few third-year courses. I ended up taking fewer Math courses, which puts my ability to graduate on time at risk. Listening to lectures and reading about abstract groups and rings made me really miss programming and computer science. Despite this, there were still some Math courses I enjoyed, such as Combinatorics and Real Analysis. However, I didn’t last long in the follow-up Real Analysis courses that covered Stone-Weierstrass and Commutative C* Algebra. Feeling tired of abstract Mathematics, I decided to take a break and pursue an internship at a telecommunications enterprise.

What am I doing Now?

As mentioned, I am currently doing a year-long internship with a telecommunications enterprise. Although the job isn't very exciting, it's a welcome break from Mathematics. This would typically be a great chance to catch up on my Computer Science studies by delving into textbooks and online resources, but I’ve been quite lazy. Instead, I've been focusing on learning French, a language I've always wanted to master. I started learning French in elementary school, as it’s a requirement in Canada. While it might make more sense to learn my mother tongue, I’m opting to learn French, which might seem confusing to some. For context, I don't have an English name and was born in some Asian country but I am unable to communicate with others in my mother tongue.