Journey to a full x64 (dis)assembler [Part 3]

posted in Making of: Acclimate Engine

Published March 19, 2024

Welcome back. Last time, we talked about building a disassembler to view the generated assembly code (https://www.gamedev.net/blogs/entry/2277765-journey-to-a-full-x64-disassembler-part-2/). This time, we are going to look at the framework for actually generating assembly.

What & why

In general, when you see assembly, you usually see the text-based disassembly form. C(++) compilers will often allow you to insert custom asm, by writing the instruction in text form:

asm("mov rax,1");

We specifically do not want to do that, as we are writing an actual compiler ourselves. While certain compilers seem to use an intermediate text-based form for their code-generator before compiling to machine-code, this is not necessary for the requirements of my language. I also really don't want to deal with text-parsing - plus, it would add considerable overhead. So, what we want is an actual framework to programmatically generate assembly.

What we had

As I said, I did write an assembler already, but without really understanding all the intricacies of the language. So what I had looked something like this:

void JitGenerator::WriteConvert(bool toInt)
{
	static constexpr uint8_t OPERAND_SIZE = sizeof(int);
	const auto reg = m_impl.WriteLoadCallStackTop(RegisterType::RAX);

	if (toInt)
	{
		m_impl.WriteStoreRegister(reg, RegisterType::XMM0, OPERAND_SIZE);
		// TODO: only use ECX
		m_impl.WriteConvertToInt(RegisterType::XMM0, RegisterType::RCX);
		m_impl.WriteStoreRegister(RegisterType::RCX, reg, OPERAND_SIZE);
	}
	else
	{
		m_impl.WriteConvertToFloat(reg, RegisterType::XMM0, OPERAND_SIZE);
		m_impl.WriteStoreRegister(RegisterType::XMM0, reg, OPERAND_SIZE);
	}
}

This is the generator for an instruction that converts an int to a float, or vice versa (not bit-wise, but actually going 1 <=> 1.0). Since our language is stack-based, we need to load the top of that stack (that's a shorthand for push+pop, which would be less efficient), then store it into a register (if you don't know assembly: a register is a field directly inside the processor that stores bitwise data, and is much cheaper to operate on than memory), invoke a cpu-instruction to do the conversion, and then store the result back into memory.
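To make the "not bit-wise" part concrete, here is a small sketch in plain C++ of what the generated code is supposed to compute. The function names are made up for illustration and are not part of the engine; the comments relate them to the cvtsi2ss / cvttss2si instructions that show up in the final version of the generator further down.

```cpp
#include <cstdint>
#include <cstring>

// Value conversion: 1 -> 1.0f (the job of cvtsi2ss).
inline float IntToFloatValue(int32_t value)
{
	return static_cast<float>(value);
}

// Value conversion with truncation toward zero: 1.9f -> 1 (the job of cvttss2si).
inline int32_t FloatToIntValue(float value)
{
	return static_cast<int32_t>(value);
}

// Bit-wise reinterpretation, shown only for contrast: the 32 bits are copied unchanged.
inline int32_t FloatBits(float value)
{
	static_assert(sizeof(float) == sizeof(int32_t), "sketch assumes a 32-bit float");
	int32_t bits;
	std::memcpy(&bits, &value, sizeof(bits));
	return bits;
}
```

The first two are what the convert-instruction has to do; the third one is what a plain mov between the register types would give you.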

Now while that code worked, it had several issues. Since I didn't really know how registers are supposed to work, or the general encoding-rules for instructions, writing this thing was a free-for-all. Effectively, WriteStoreRegister could take any register, at any slot, and any operandSize (which is the size of any potential memory-target), in any combination - be it valid or not. We could have easily written:

m_impl.WriteStoreRegister(RegisterType::AL, RegisterType::XMM0, 256);

Which would attempt to store the content of an 8-bit register (AL) inside a 16-byte register (XMM0), with a memory-operand size of 256 bytes. Nothing about that works. You cannot copy from an 8-bit register to an XMM-register; no operation exists to copy 256 bytes at once; and using a memory-operand size for a register→register mov makes no sense in and of itself. But this code would have compiled, and would only trigger an internal assert when it is run (if I was lucky enough to hit that case). That's pretty bad, as certain instructions and paths in the code-gen are rare.

What I actually want is a framework that catches those kinds of mistakes while I am writing the generator-code, and triggers a compile-error. Luckily, in C++ especially, we can do that kind of thing.

Changing things up

Now, I already said before that I wanted to use the new disassembler-framework to create the assembly as well. We already can define instructions, and argument-types. We now only need a way to generate them. So we are making an “AssemblyGenerator”, with methods to generate all the potential instructions. Let's look at a simpler case, of pushing a register:

void AssemblyGenerator::WritePush(Register64 reg)
{
	AddInstruction<PushInstruction>(reg);
}

So, the PushInstruction is the same as for the disassembler, and Register64 is a type that we defined. We are now making a very clear distinction between different register-sizes. While “EAX” and “RAX” are technically addressing the same register, they have different sizes, and can be used in different contexts. So, we can now call this method:

generator.WritePush(Register64::RAX);

And it will only compile if an actual 64-bit register is passed to it.
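A minimal sketch of why this works, assuming the register types are scoped enums (the post does not show their definition, so the enums and the SketchGenerator class below are stand-ins, not the real framework). Scoped enums never convert implicitly, so a 32-bit register or a raw integer is rejected by the compiler:

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical register enums: one distinct type per register size.
enum class Register32 : uint8_t { EAX, ECX, EDX, EBX };
enum class Register64 : uint8_t { RAX, RCX, RDX, RBX };

// Hypothetical stand-in for the generator; it only records the register index.
class SketchGenerator
{
public:
	void WritePush(Register64 reg) { m_pushed.push_back(static_cast<uint8_t>(reg)); }
	const std::vector<uint8_t>& Pushed() const { return m_pushed; }

private:
	std::vector<uint8_t> m_pushed;
};

// WritePush accepts 64-bit registers only; the other calls would not compile.
using PushFn = decltype(&SketchGenerator::WritePush);
static_assert(std::is_invocable_v<PushFn, SketchGenerator&, Register64>);
static_assert(!std::is_invocable_v<PushFn, SketchGenerator&, Register32>);
static_assert(!std::is_invocable_v<PushFn, SketchGenerator&, int>);
```

The static_asserts are only there to prove the point without writing a line that fails to build; in real generator-code you would simply get the compile-error at the call-site.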

For more complex cases, we introduce a param-struct to represent the potential combinations of parameters:

struct WriteRMImmParams
{
	WriteRMImmParams(Register8 target, uint8_t imm) noexcept :
		rm(target), imm(imm) {}
	WriteRMImmParams(Memory8 target, uint8_t imm) noexcept :
		rm(target), imm(imm) {}
	WriteRMImmParams(Register32 target, uint32_t imm) noexcept :
		rm(target), imm(imm) {}
	WriteRMImmParams(RegisterMemory32 target, uint32_t imm) noexcept :
		rm(target), imm(imm) {}

	WriteRMImmParams(Register64 target, uint32_t imm) noexcept :
		rm(target), imm(imm) {}
	WriteRMImmParams(RegisterMemory64 target, uint32_t imm) noexcept :
		rm(target), imm(imm) {}

	RegisterMemoryVar rm;
	ImmVar imm;
};

This is a parameter-set that supports operations which write an immediate-value (integer) to either a register or a memory-address. The potential permutations are hardcoded here. So, once again, we can call a certain method only with supported combinations of args:

generator.WriteAdd({Register8::AL, 0}); // compiles
generator.WriteAdd({RegisterXmm::XMM0, 0}); // doesn't

This unfortunately requires all calls with more than one parameter, to use {}-brackets around it. But those param-structs are reused between many commands, and I do not want to have to specify all potential combinations for all “WriteX” methods.
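Here is a cut-down sketch of how such a param-struct can be put together. I am assuming that RegisterMemoryVar and ImmVar are std::variant-like types, which the post does not confirm; all names prefixed with Sketch, as well as ImmediateSize, are hypothetical. The constructors define the legal combinations, and the variants remember which one was picked:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <variant>

enum class Register8 : uint8_t { AL, CL };
enum class Register32 : uint8_t { EAX, ECX };
enum class RegisterXmm : uint8_t { XMM0, XMM1 };

// Assumed shape of the "Var" members: a variant over the allowed alternatives.
using SketchRegisterVar = std::variant<Register8, Register32>;
using SketchImmVar = std::variant<uint8_t, uint32_t>;

struct SketchRMImmParams
{
	// Each constructor is one supported (target, immediate) combination.
	SketchRMImmParams(Register8 target, uint8_t value) noexcept : rm(target), imm(value) {}
	SketchRMImmParams(Register32 target, uint32_t value) noexcept : rm(target), imm(value) {}

	SketchRegisterVar rm;
	SketchImmVar imm;
};

// Example consumer: how many bytes the immediate would take in the encoding.
inline std::size_t ImmediateSize(const SketchRMImmParams& params)
{
	return std::visit([](auto value) { return sizeof(value); }, params.imm);
}

// An XMM register matches no constructor, so that combination cannot be built.
static_assert(std::is_constructible_v<SketchRMImmParams, Register8, uint8_t>);
static_assert(!std::is_constructible_v<SketchRMImmParams, RegisterXmm, uint8_t>);
```

A call like ImmediateSize({Register8::AL, 0}) then picks the 8-bit constructor purely from the register type, which is the same mechanism that lets WriteAdd({Register8::AL, 0}) compile.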

Now, we are again using templates to allow casting registers around:

generator.WritePush(FUNC_REG_0.As<uint8_t>());

FUNC_REG_0 is defined as RegisterType64::RCX, the first register in which function-arguments are passed on x64 (in the Microsoft calling convention). By specifying the actual type of register, we can request a specific size.
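The post does not show how As<T>() is implemented, so the following is only a guess at the general shape: a trait maps the requested operand type to the register enum of that size, and the register index is carried over. RegisterFor, SketchRegister64 and SKETCH_FUNC_REG_0 are all hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

enum class Register8 : uint8_t { AL, CL };
enum class Register32 : uint8_t { EAX, ECX };
enum class Register64 : uint8_t { RAX, RCX };

// Maps an operand type to the register enum of the same size.
template<typename T> struct RegisterFor;
template<> struct RegisterFor<uint8_t> { using Type = Register8; };
template<> struct RegisterFor<uint32_t> { using Type = Register32; };
template<> struct RegisterFor<uint64_t> { using Type = Register64; };

// Hypothetical wrapper around a 64-bit register that can be viewed at another size.
struct SketchRegister64
{
	Register64 reg;

	template<typename T>
	constexpr typename RegisterFor<T>::Type As() const noexcept
	{
		// Same register index, different static type.
		return static_cast<typename RegisterFor<T>::Type>(static_cast<uint8_t>(reg));
	}
};

inline constexpr SketchRegister64 SKETCH_FUNC_REG_0{ Register64::RCX };

static_assert(std::is_same_v<decltype(SKETCH_FUNC_REG_0.As<uint8_t>()), Register8>);
static_assert(SKETCH_FUNC_REG_0.As<uint32_t>() == Register32::ECX);
```

This relies on the enums listing the registers in the same order for every size, so that index 1 is CL, ECX and RCX respectively. Requesting a type without a RegisterFor specialization fails to compile, which is exactly what we want.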

Memory-operands also have a size. They generally only work with a 64-bit register as the base, but they specify an offset to the address stored in that register:

generator.WriteMov({ RegisterType64::RAX, { RegisterType64::RCX, 0}}); // load first 8 bytes of ptr stored in RCX

The cool thing is that, similar to writing assembly-files yourself, you do not have to specify the operand-size as long as it is clear. However, when the size cannot be determined automatically, you can specify it, once again with a template:

generator.WriteMov({ RegisterType64::RAX, RETURN_REG.MakeMemoryAs<uint64_t>(0) }); // load first 8 bytes of ptr stored in RAX
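As a rough sketch of what a typed memory-operand could look like (again with made-up names, since the real MakeMemoryAs is not shown here): the operand stores the base register and the offset, and the access size comes from the template argument rather than from a runtime value.

```cpp
#include <cstddef>
#include <cstdint>

enum class Register64 : uint8_t { RAX, RCX };

// Hypothetical typed memory operand: [base + offset], accessing sizeof(T) bytes.
template<typename T>
struct SketchMemory
{
	Register64 base;
	int32_t offset;

	static constexpr std::size_t SIZE = sizeof(T);
};

// Hypothetical register wrapper that can produce a memory operand based on itself.
struct SketchRegister64
{
	Register64 reg;

	template<typename T>
	constexpr SketchMemory<T> MakeMemoryAs(int32_t offset) const noexcept
	{
		return { reg, offset };
	}
};

inline constexpr SketchRegister64 SKETCH_RETURN_REG{ Register64::RAX };
```

With this, SKETCH_RETURN_REG.MakeMemoryAs<uint64_t>(0) describes "8 bytes at [RAX + 0]", and a param-struct constructor can accept or reject it based on SketchMemory<uint64_t> alone.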

Generally, we have lots of different types: RegisterX, MemoryX, RegisterMemoryX, RegisterMemoryGeneric (untyped memory) - all of the different types that could be used by instructions. That allows us to both generalize code and keep it robust, without having to check all code-paths.

Wrapping it up

This one's a bit shorter. I think I have explained everything about the assembler. The new framework allowed me to easily refactor and improve the existing code-gen. The initial instruction for converting int and float now looks like this:

void JitGenerator::WriteConvert(bool toInt)
{
	const auto [source, target, isConversion] = WriteLoadCallStackTopToModify<int>(0);

	if (toInt)
	{
		m_impl.gen.WriteMovss({ RegisterTypeXmm::XMM0, source });
		m_impl.gen.WriteCvttss2si({ RegisterType32::ECX, RegisterTypeXmm::XMM0 });
		m_impl.gen.WriteMov({ target, RegisterType32::ECX });
	}
	else
	{
		m_impl.gen.WriteCvtsi2ss({ RegisterTypeXmm::XMM0, source });
		m_impl.gen.WriteMovss({ target, RegisterTypeXmm::XMM0 });
	}
}

I've also decided to use the names of the actual instructions, and not some abstracted wrapper (as that kind of code-gen would need to be specialized for different target-platforms anyway). The code here also already includes a feature of the fully nativized runtime (where target and source for the current stack could be different) - but I'll maybe talk about more of the details of the actual language in a later entry. I'll also briefly cover some related topics, like writing the VisualStudio-debugger extension.

Thanks for reading.


Juliean

Author
