Nearly Doubling My CPU's Clock Speed by Removing Complexity
How simplifying my instruction fetch path improved Fmax from ~25 MHz to ~45 MHz on my custom dual-issue CPU.
I'm Dulat. I like taking ideas all the way from scratch code to something that actually runs: custom CPU cores, emulators, tools, and backend services.
My custom CPU project: a 16-bit ISA with a 5-stage, in-order, dual-issue microarchitecture written in SystemVerilog, backed by a C/C++ emulator and an assembler/linker.
Not everything I do is CPU design. I also write a lot of C/C++ and backend code, build small tools, and experiment with systems that sit closer to real-world use.
C/C++ and web backends for small services and apps, with a focus on clear interfaces and reliability over cleverness.
Utility scripts, CLIs, and helpers that support my CPU and systems work: testing harnesses, build tooling, and debugging helpers.
Roughly cycle-level emulators for prototyping ISAs and behavior before committing to hardware.
Long-term goal: tie my CPU and software work into real embedded/avionics-style stacks when the hardware is ready.
Roughly how a project goes for me, from idea to something I can actually run, test, and break.
Figuring out the ISA, architecture, or system boundaries on paper before touching code.
C/C++ for emulators and tools, SystemVerilog for hardware, plus small support programs when needed.
Writing tests, running programs end-to-end, and checking behavior against what I intended, not what I assumed.
Refactoring, simplifying, and fixing whatever falls over under real use.
Connecting pieces together: toolchains, small services, and (eventually) real hardware targets.
First full loop working: custom ISA, emulator, assembler/linker, and SystemVerilog microarchitecture reaching v0.1.0.
Improving the assembler/linker and general build flow so adding new instructions and tests doesn’t suck.
Started with simple emulators and small programs to get comfortable with instruction sets and control flow at a low level.
A quick introduction to my new site built with Astro and Tailwind CSS.