Introduction to reverse engineering and Assembly.

I recently wrote a blog post giving an introduction to reverse engineering and assembly language on the Purism blog. Considering that my last blog post on my own website is from 3 years ago and this post is useful beyond the needs of just Purism, I thought it might have a nice home in my own personal blog as well, so here’s a copy paste of the entire blog post, as is.

 

 


 

Recently, I’ve finished reverse engineering the Intel FSP-S “entry” code, that is from the entry point (FspSiliconInit) all the way to the end of the function and all the subfunctions that it calls. This is only some initial foray into reverse engineering the FSP as a whole, but reverse engineering is something that takes a lot of time and effort. Today’s blog post is here to illustrate that, and to lay the foundations for understanding what I’ve done with the FSP code (in a future blog post).

Over the years, many people have asked me to teach them what I do, or to explain how to reverse engineer assembly code in general. Sometimes I hear the infamous “How hard can it be?” catchphrase. Last week, someone I was talking with thought that assembly language is just like a regular programming language, but in binary form; it’s an easy mistake to make if you’ve never seen what assembly is or looks like. Historically, I’ve always said that reverse engineering and ASM is “too complicated to explain” or that “If you need help to get started, then you won’t be able to finish it on your own”, along with various other vague responses. I often wanted to explain to others why I said things like that, but I never found a way to do it. You see, when something is complex, it’s easy to say that it’s complex, but it’s much harder to explain to people why it’s complex.

I was lucky to recently stumble onto a little function while reverse engineering the Intel FSP, a function that was both simple and complex, where figuring out what it does was an interesting challenge that I can easily walk you through. This function wasn’t a difficult thing to understand, and by far, it’s not one of the hard or complex things to reverse engineer, but this one is “small and complex enough” that it’s a perfect example to explain, without writing an entire book or getting into the more complex aspects of reverse engineering. So today’s post serves as a “primer” guide to reverse engineering for all of those interested in the subject. It is a required read in order to understand the next blog posts I would be writing about the Intel FSP. Ready? Strap on your geek helmet and let’s get started!


DISCLAIMER: I might make false statements in the blog post below, some by mistake, some intentionally for the purpose of simplifying the explanations. For example, when I say below that there are 9 registers in x86, I know there are more (SSE, FPU, or even just the DS or EFLAGS registers, or purposefully mentioning EAX rather than RAX, etc.), but I just don’t want to complicate matters by going too wide in my explanations.


A prelude

First things first, you need to understand some basic concepts, such as “what is ASM exactly”. I will explain some basic concepts but not all the basic concepts you might need. I will assume that you know at least what a programming language is and know how to write a simple “hello world” in at least one language, otherwise you’ll be completely lost.

So, ASM is the Assembly language, but it’s not the actual binary code that executes on the machine. It is, however, very similar to it. To be more exact, the assembly language is a textual representation of the binary instructions given to the microprocessor. You see, when you compile your regular C program into an executable, the compiler will transform all your code into some very, very, very basic instructions. Those instructions are what the CPU will understand and execute. By combining a lot of small, simple and specific instructions, you can do more complex things. That’s the basis of any programming language, of course, but with assembly, the building blocks that you get are very limited. Before I talk about instructions, I want to explain two concepts which you’ll need to follow the rest of the story.

The stack

First I’ll explain what “the stack” is.  You may have heard of it before, or maybe you didn’t, but the important thing to know is that when you write code, you have two types of memory:

  • The first one is your “dynamic memory”, that’s when you call ‘malloc’ or ‘new’ to allocate new memory, this goes from your RAM upward (or left-to-right), in the sense that if you allocate 10 bytes, you’ll first get address 0x1000 for example, then when you allocate another 30 bytes, you’ll get address 0x100A, then if you allocate another 16 bytes, you’ll get 0x1028, etc.
  • The second type of memory that you have access to is the stack, which is different, instead it grows downward (or right-to-left), and it’s used to store local variables in a function. So if you start with the stack at address 0x8000, then when you enter a function with 16 bytes worth of local variables, your stack now points to address 0x7FF0, then you enter another function with 64 bytes worth of local variables, and your stack now points to address 0x7FB0, etc. The way the stack works is by “stacking” data into it, you “push” data in the stack, which puts the variable/data into the stack and moves the stack pointer down, you can’t remove an item from anywhere in the stack, you can always only remove (pop) the last item you added (pushed). A stack is actually an abstract type of data, like a list, an array, a dictionary, etc. You can read more about what a stack is on wikipedia and it shows you how you can add and remove items on a stack with this image:

The image shows you what we call a LIFO (Last-In-First-Out) and that’s what a stack is. In the case of the computer’s stack, it grows downward in the RAM (as opposed to upward in the above image) and is used to store local variables as well as the return address for your function (the instruction that comes after the call to your function in the parent function). So when you look at a stack, you will see multiple “frames”, you’ll see your current function’s stack with all its variables, then the return address of the function that called it, and above it, you’ll see the previous function’s frame with its own variables and the address of the function that called it, and above, etc. all the way to the main function which resides at the top of the stack.

Here is another image that exemplifies this:

The registers

The second thing I want you to understand is that the processor has multiple “registers”. You can think of a register as a variable, but there are only 9 total registers on x86, with only 7 of them usable. So, on the x86 processor, the various registers are: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, EIP.

There are two registers in there that are special:

  • The EIP (Instruction Pointer) contains the address of the current instruction being executed.
  • The ESP (Stack Pointer) contains the address of the stack.

Access to the registers is extremely fast when compared to accessing the data in the RAM (the stack also resides on the RAM, but towards the end of it) and most operations (instructions) have to happen on registers. You’ll understand more when you read below about instructions, but basically, you can’t use an instruction to say “add value A to value B and store it into address C”, you’d need to say “move value A into register EAX, then move value B into register EBX, then add register EAX to register EBX and store the result in register ECX, then store the value of register ECX into the address C”.

The instructions

Let’s go back to explaining instructions now. As I explained before, the instructions are the basic building blocks of the programs, and they are very simple, they take the form of:

INS OP1, OP2, OP3

Where “INS” is the instruction, and OP1, OP2, OP3 are what we call the “operands”. Most instructions take 2 operands, some take no operands, some take one operand and others take 3 operands. The operands are usually registers. Sometimes, the operand can be an actual value (what we call an “immediate value”) like “1”, “2” or “3”, etc. and sometimes, the operand is a relative position from a register, like for example “[%eax + 4]” meaning the address pointed to by the %eax register + 4 bytes. We’ll see more of that shortly. For now, let’s give you the list of the most common and used instructions:

  • “MOV”: Move data from one operand into another
  • “ADD/SUB/MUL/DIV”: Add, Subtract, Multiply, Divide one operand with another and store the result in a register
  • “AND/OR/XOR/NOT/NEG”: Perform logical and/or/xor/not/negate operations on the operand
  • “SHL/SHR”: Shift Left/Shift Right the bits in the operand
  • “CMP/TEST”: Compare one register with an operand
  • “JMP/JZ/JNZ/JB/JS/etc.”: Jump to another instruction (Jump unconditionally, Jump if Zero, Jump if Not Zero, Jump if Below, Jump if Sign, etc.)
  • “PUSH/POP”: Push an operand into the stack, or pop a value from the stack into a register
  • “CALL”: Call a function. This is the equivalent of doing a “PUSH %EIP+4” + “JMP”. I’ll get into calling conventions later.
  • “RET”: Return from a function. This is the equivalent of doing a “POP %EIP”

That’s about it, that’s what most programs are doing. Of course, there’s a lot more instructions, you can see a full list here, but you’ll see that most of the other instructions are very obscure or very specific or variations on the above instructions, so really, this represents most of the instructions you’ll ever encounter.

I want to explain one thing before we go further down: there is an additional register I didn’t mention before called the FLAGS register, which is basically just a status register that contains “flags” that indicate when some arithmetic condition happened on the last arithmetic operation. For example, if you add 1 to 0xFFFFFFFF, it will give you ‘0’ but the “Carry flag” will be set in the FLAGS register, because the unsigned result did not fit in 32 bits. If you subtract 5 from 0, it will give you 0xFFFFFFFB and the “Sign flag” will be set because the result is negative, and if you subtract 3 from 3, the result will be zero and the “Zero flag” will be set.

I’ve shown you the “CMP” instruction which is used to compare a register with an operand, but you might be wondering, “What does it mean exactly to ‘compare’?” Well, it’s simple: the CMP instruction is the same thing as the SUB instruction, in that it subtracts one operand from another, but the difference is that it doesn’t store the result anywhere. However, it does get your flags updated in the FLAGS register. For example, if I wanted to compare the %EAX register with the value ‘2’, and %EAX contains the value 3, this is what’s going to happen: you will subtract 2 from the value, the result will be 1, but you don’t care about that. What you care about is that the ZF (Zero flag) is not set, and the SF (Sign flag) is not set, which means that %eax and ‘2’ are not equal (otherwise, ZF would be set), and that the value in %eax is greater than 2 (because SF is not set), so you know that “%eax > 2” and that’s what the CMP does.

The TEST instruction is very similar but it does a logical AND on the two operands for testing, so it’s used for comparing logical values instead of arithmetic values (“TEST %eax, 1” can be used to check if %eax contains an odd or even number for example).

This is useful because the next bunch of instructions I explained in the list above is conditional Jump instructions, like “JZ” (jump if zero) or “JB” (jump if below), or “JS” (jump if sign), etc. This is what is used to implement “if, for, while, switch/case, etc.” it’s as simple as doing a “CMP” followed by a “JZ” or “JNZ” or “JB”, “JA”, “JS”, etc.

And if you’re wondering what the difference is between “Jump if below”, “Jump if sign” and “Jump if less”, since they all seem to mean that the comparison gave a negative result: “jump if below” is used for unsigned integers, “jump if less” is used for signed integers, and “jump if sign” can be misleading. An unsigned 3 – 4 would give us a very high positive result, for example. In practice, JB checks the Carry Flag, JS checks the Sign Flag, and JL checks whether the Sign Flag differs from the Overflow Flag. See the Conditional Jump page for more details.

A practical example

Here’s a very small and simple practical example, if you have a simple C program like this:

int add_a_and_b(int a, int b);   /* declared first so that main can call it */

int main() {
   return add_a_and_b(2, 3);
}

int add_a_and_b(int a, int b) {
   return a + b;
}

It would compile into something like this:
_main:
   push   3                ; Push the second argument '3' into the stack
   push   2                ; Push the first argument '2' into the stack
   call   _add_a_and_b     ; Call the _add_a_and_b function. This will put the address of the next
                           ; instruction (add) into the stack, then it will jump into the _add_a_and_b
                           ; function by putting the address of the first instruction in the _add_a_and_b
                           ; label (push %ebx) into the EIP register
   add    %esp, 8          ; Add 8 to the esp, which effectively pops out the two values we just pushed into it
   ret                     ; Return to the parent function.... 

_add_a_and_b:
   push   %ebx             ; We're going to modify %ebx, so we need to push it to the stack
                           ; so we can restore its value when we're done
   mov    %eax, [%esp+8]   ; Move the first argument (8 bytes above the stack pointer) into EAX
   mov    %ebx, [%esp+12]  ; Move the second argument (12 bytes above the stack pointer) into EBX
   add    %eax, %ebx       ; Add EAX and EBX and store the result into EAX
   pop    %ebx             ; Pop EBX to restore its previous value
   ret                     ; Return back into the main. This will pop the value on the stack (which was
                           ; the address of the next instruction in the main function that was pushed into
                           ; the stack when the 'call' instruction was executed) into the EIP register

Yep, something as simple as that, can be quite complicated in assembly. Well, it’s not really that complicated actually, but a couple of things can be confusing.

You have only 7 usable registers, and one stack. Every function gets its arguments passed through the stack, and can return its return value through the %eax register. If every function modified every register, then your code would break, so every function has to ensure that the other registers are unmodified when it returns (other than %eax). You pass the arguments on the stack and your return value through %eax, so what should you do if you need to use a register in your function? Easy: you keep a copy on the stack of any registers you’re going to modify so you can restore them at the end of your function. In the _add_a_and_b function, I did that for the %ebx register as you can see. For more complex functions, it can get a lot more complicated than that, but let’s not get into that for now (for the curious: compilers will create what we call a “prologue” and an “epilogue” in each function. In the prologue, you store the registers you’re going to modify and set up the %ebp (base pointer) register to point to the base of the stack when your function was entered, which allows you to access things without keeping track of the pushes/pops you do throughout the function. Then, in the epilogue, you pop the registers back and restore %esp to the value that was saved in %ebp before you return).

The second thing you might be wondering about is with these lines:

mov %eax, [%esp+8]
mov %ebx, [%esp+12]

And to explain it, I will simply show you this drawing of the stack’s contents when we call those two instructions above:

For the purposes of this exercise, we’re going to assume that the _main function is located in memory at the address 0xFFFF0000, and that each instruction is 4 bytes long (the size of each instruction can vary depending on the instruction and on its operands). So you can see, we first pushed 3 into the stack, %esp was lowered, then we pushed 2 into the stack, %esp was lowered, then we did a ‘call _add_a_and_b’, which stored the address of the next instruction (the fourth instruction in the main, so ‘_main+12’) into the stack and %esp was lowered, then we pushed %ebx, which I assumed here contained a value of 0, and the %esp was lowered again. If we now wanted to access the first argument to the function (2), we need to access %esp+8, which will let us skip the saved %ebx and the ‘Return address’ that are in the stack (since we’re working with 32 bits, each value is 4 bytes). And in order to access the second argument (3), we need to access %esp+12.

Binary or assembly?

One question that may (or may not) be popping into your mind now is “wait, isn’t this supposed to be the ‘computer language’, so why isn’t this binary?” Well, it is… in a way. As I explained earlier, “the assembly language is a textual representation of the binary instructions given to the microprocessor”, what it means is that those instructions are given to the processor as is, there is no transformation of the instructions or operands or anything like that. However, the instructions are given to the microprocessor in binary form, and the text you see above is just the textual representation of it.. kind of like how “68 65 6c 6c 6f” is the hexadecimal representation of the ASCII text “hello”. What this means is that each instruction in assembly language, which we call a ‘mnemonic’ represents a binary instruction, which we call an ‘opcode’, and you can see the opcodes and mnemonics in the list of x86 instructions I gave you above. Let’s take the CALL instruction for example. The opcode/mnemonic list is shown as:

Opcode   Mnemonic         Description
E8 cw    CALL rel16       Call near, relative, displacement relative to next instruction
E8 cd    CALL rel32       Call near, relative, displacement relative to next instruction
FF /2    CALL r/m16       Call near, absolute indirect, address given in r/m16
FF /2    CALL r/m32       Call near, absolute indirect, address given in r/m32
9A cd    CALL ptr16:16    Call far, absolute, address given in operand
9A cp    CALL ptr16:32    Call far, absolute, address given in operand
FF /3    CALL m16:16      Call far, absolute indirect, address given in m16:16
FF /3    CALL m16:32      Call far, absolute indirect, address given in m16:32

This means that this same “CALL” mnemonic can have multiple kinds of addresses to call. Actually, there are four different possibilities, each having a 16-bit and a 32-bit variant. The first possibility is to call a function with a relative displacement (Call the function 100 bytes below this current position), or an absolute address given in a register (Call the function whose address is stored in %eax) or an absolute address given as a pointer (Call the function at address 0xFFFF0100), or an absolute address given as an offset to a segment (I won’t explain segments now). In our example above, the “call _add_a_and_b” was probably stored as a call relative to the current position with 12 bytes below the current instruction (4 bytes per instruction, and we have the CALL, ADD, RET instructions to skip). This means that the instruction in the binary file was encoded as “E8 0C 00 00 00” (The E8 opcode to mean a “CALL near, relative”, and the “0C 00 00 00” to mean 12 bytes relative to the current instruction, stored least significant byte first because x86 is little-endian). Now, the most observant of you have probably noticed that this CALL instruction takes 5 bytes total, not 4, but as I said above, we will assume it’s 4 bytes per instruction just for the sake of keeping things simple, but yes, the CALL (in this case) is 5 bytes, and other instructions will sometimes have more or fewer bytes as well.

I chose the CALL instruction as an example above because I think it’s the least complicated to explain. Other instructions have even more complicated opcodes and operands (see the ADD and ADC (Add with Carry) instructions for example; you’ll notice that they even share some of the same opcodes, so they are the same instruction, but it’s easier to give them separate mnemonics to differentiate their behaviors).

Here’s a screenshot showing a side by side view of the Assembly of a function with the hexadecimal view of the binary:

As you can see, I have my cursor on address 0xFFF6E1D6 on the assembly view on the left, which is also highlighted on the hex view on the right. That address is a CALL instruction, and you can see the equivalent hex of “E8 B4 00 00 00”, which means it’s a CALL near, relative (E8 being the opcode for it) and the function is 0xB4 (180) bytes below our current position of 0xFFF6E1D6.

If you open the file with a hexadecimal editor, you’ll only see the hex view on the right, but you need to put the file into a Disassembler (such as the IDA disassembler which I’m using here, but there are cheaper alternatives as well, the list can be long), and the disassembler will interpret those binary opcodes to show you the textual assembly representation which is much much easier to read.

Some actual reverse engineering

Now that you have the basics, let’s do a quick reverse engineering exercise… This is a very simple function that I’ve reversed recently. It comes from the SiliconInit part of the FSP, and it’s used to validate the UPD configuration structure (used to tell it what to do).

Here is the Assembly code for that function:

This was disassembled using IDA 7.0 (The Interactive DisAssembler) which is an incredible (but expensive) piece of software. There are other disassemblers which can do similar jobs, but I prefer IDA personally. Let’s first explain what you see on the screen.

On the left side, you see “seg000:FFF40xxx”; this means that we are in the segment “seg000” at the address 0xFFF40xxx. I won’t explain what a segment is, because you don’t need to know it. The validate_upd_config function starts at address 0xFFF40311 in the RAM, and there’s not much else to understand. You can see how the address increases from one instruction to the next, which can help you calculate the size in bytes that each instruction takes in RAM, if you’re curious of course… (the XOR is 2 bytes, the CMP is 3 bytes, etc.).

As you’ve seen in my previous example, anything after a semicolon (“;”) is considered a comment and can be ignored. The “CODE XREF” comments are added by IDA to tell us that this code has a cross-reference from (is referenced by) some other code. So when you see “CODE XREF: validate_upd_config+9” (at 0xFFF40363, the RETN instruction), it means this instruction is referenced from the function validate_upd_config, and the “+9” means 9 bytes into the function (so since the function starts at 0xFFF40311, it means it’s referenced from the instruction at address 0xFFF4031A). The little “up” arrow next to it means that it comes from above the current position in the code, and if you follow the grey lines on the left side of the screen, you can follow that jump up to the address 0xFFF4031A which contains the instruction “jz short locret_FFF40363”. I assume the “j” letter right after the up arrow is to tell us that the reference comes from a “jump” instruction.

As you can see on the left side of the screen, there are a lot of arrows, which means that there’s a lot of jumping around in the code, even though it’s not immediately obvious. The awesome IDA software has a “layout view” which gives us a much nicer view of the code, and it looks like this:

Now you can see each block of code separately in its own little box, with arrows linking all of the boxes together whenever a jump happens. The green arrows mean that it’s a conditional jump when the condition is successful, while the red arrows mean the condition was not successful. This means that a “JZ” will show a green arrow towards the code it would jump to if the result is indeed zero, and a red arrow towards the block where the result is not zero. A blue arrow means that it’s an unconditional jump.

I usually do my reverse engineering using the layout view, as I find it much easier to read/follow, but for the purpose of this exercise, I will use the regular linear view instead, because I think it will be easier for you to follow. The reason is mostly that the layout view doesn’t display the address of each instruction, and it’s easier to have you follow along if I can point out exactly which instruction I’m looking at by mentioning its address.

Now that you know how to read the assembly code and you understand the various instructions, I feel you should be ready to reverse engineer this very simple assembly code (even though it might seem complex at first). I just need to give you the following hints first:

  • Because I’ve already reverse engineered it, you get the beautiful name “validate_upd_config” for the function, but technically, it was simply called “sub_FFF40311”
  • I had already reverse engineered the function that called it so I know that this function is receiving its arguments in an unusual way. The arguments aren’t pushed to the stack, instead, the first argument is stored in %ecx, and the second argument is stored in %edx
  • The first argument (%ecx, remember?) is an enum to indicate what type of UPD structure to validate, let me help you out and say that type ‘3’ is the FSPM_UPD (The configuration structure for the FSPM, the MemoryInit function), and that type ‘5’ is the FSPS_UPD (The configuration structure for the FSPS, the SiliconInit function).
  • Reverse engineering is really about reading one line at a time, in a sequential manner, keep track of which blocks you reversed and be patient. You can’t look at it and expect to understand the function by viewing the big picture.
  • It is very very useful in this case to have a dual monitor, so you can have one monitor for the assembly, and the other monitor for your C code editor. In my case, I actually recently bought an ultra-wide monitor and I split screen between my IDA window and my emacs window and it’s great. It’s hard otherwise to keep going back and forth between the assembly and the C code. That being said, I would suggest you do the same thing here and have a window on the side showing you the assembly image above (not the layout view) while you read the explanation on how to reverse engineer it below.

Got it? All done? No? Stop sweating and hyperventilating… I’ll explain exactly how to reverse engineer this function in the next paragraph, and you will see how simple it turns out to be!

Let’s get started!

The first thing I do is write the function in C. Since I know the name and its arguments already, I’ll do that:

void validate_upd_config (uint8_t action, void *config) {
}

Yeah, there’s not much to it yet. I set it to return “void” because I don’t know yet if it returns anything, and I declared the first argument “action” as a “uint8_t” because in the parent function it’s used as a single byte register (I won’t explain for now how to differentiate 1-byte, 2-byte, 4-byte and 8-byte registers). The second argument is a pointer, but I don’t know what kind of structure it points to exactly, so I just set it as a void *.

The first instruction is a “xor eax, eax”. What does this do? It XORs the eax register with the eax register and stores the result in the eax register itself, which is the same thing as “mov eax, 0”, because 1 XOR 1 = 0 and 0 XOR 0 = 0, so if every bit in the eax register is logically XORed with itself, it will give 0 for the result. If you’re asking yourself “Why did the compiler decide to do ‘xor eax, eax’ instead of ‘mov eax, 0’?” then the answer is simple: “Because it takes fewer CPU clock cycles to do a XOR than to do a move”, which means it’s more optimized and it will run faster. Besides, the XOR takes 2 bytes as you can see above (the address of the instructions jumped from FFF40311 to FFF40313), while a “mov eax, 0” would have taken 5 bytes. So it also helps keep the code smaller.

Alright, so now we know that eax is equal to 0, let’s keep that in mind and move along (I like to keep track of things like that as comments in my C code). The next instruction does a “cmp ecx, 3”, so it’s comparing ecx, which we already know is our first argument (uint8_t action), with 3. OK, it’s a comparison, not much to do here, again let’s keep that in mind and continue… The next instruction does a “jnz short loc_FFF40344”, which is more interesting: if the previous comparison is NOT ZERO, then jump to the label loc_FFF40344 (for now ignore the “short”, it just helps us differentiate between the various mnemonics, and it means that the jump target is a relative offset small enough to fit in a single signed byte, which is why the whole jnz instruction takes only 2 bytes of code, as you can confirm from the addresses). Great, so there’s a jump if the result is NOT ZERO, which means that if the result is zero, the code will just continue, which means if the ecx register (action variable) is EQUAL (subtraction is zero) to 3, the code will continue down to the next instruction instead of jumping… let’s do that, and in the meantime we’ll update our C code:

void validate_upd_config (uint8_t action, void *config) {
   // eax = 0
   if (action == 3) {
      // 0xFFF40318 
   } else {
      // loc_FFF40344
   }
}

The next instruction is “test edx, edx”. We know that the edx register is our second argument which is the pointer to the configuration structure. As I explained above, the “test” is just like a comparison, but it does an AND instead of a subtraction, so basically, you AND edx with itself. Well, of course, that has no consequence, 1 AND 1 = 1, and 0 AND 0 = 0, so why is it useful to test a register against itself? Simply because the TEST will update our FLAGS register… so when the next instruction is “JZ” it basically means “Jump if the edx register was zero”… And yes, doing a “TEST edx, edx” is more optimized than doing a “CMP edx, 0”, you’re starting to catch on, yay!

And indeed, the next instruction is “jz locret_FFF40363”, so if the edx register is ZERO, then jump to locret_FFF40363, and if we look at that locret_FFF40363, it’s a very simple “retn” instruction. So our code becomes:

void validate_upd_config (uint8_t action, void *config) {
  // eax = 0
  if (action == 3) {
    if (config == NULL)
       return; 
  } else {
    // loc_FFF40344
  }
}

Next! Now it gets slightly more complicated… the instruction is: “cmp dword ptr [edx], 554C424Bh”, which means we do a comparison of a dword (4 bytes), of the data pointed to by the pointer edx, with no offset (“[edx]” is the same as saying “edx[0]” if it was a C array for example), and we compare it to the value 554C424Bh… the “h” at the end means it’s a hexadecimal value, and with experience you can quickly notice that the hexadecimal value is all within the ASCII range, so using a Hex to ASCII converter, we realize that those 4 bytes represent the ASCII letters “KBLU” (which is why I manually added them as a comment to that instruction, so I won’t forget). So basically the instruction compares the first 4 bytes of the structure (the content pointed to by the edx pointer) to the string “KBLU”. The next instruction does a “jnz loc_FFF4035E” which means that if the comparison result is NOT ZERO (so, if they are not equal) we jump to loc_FFF4035E.

Instead of continuing sequentially, I will see what that loc_FFF4035E contains (of course, I did the same thing in all the previous jumps, and had to decide if I wanted to continue reverse engineering the jump or the next instruction; in this case, it seems better for me to jump, you’ll see why soon). The loc_FFF4035E label contains the following instruction: “mov eax, 80000002h”, which means it stores the value 0x80000002 into the eax register, and then it jumps to (not really, it just naturally flows to the next instruction which happens to be the label) locret_FFF40363, which is just a “retn”. This makes our code into this:

uint32_t validate_upd_config (uint8_t action, void *config) {
  // eax = 0
  if (action == 3) {
    if (config == NULL)
       return 0; 
    if (((uint32_t *)config)[0] != 0x554C424B /* 'KBLU' */)
       return 0x80000002;
  } else {
    // loc_FFF40344
  }
}

The observant here will notice that I’ve changed the function prototype to return a uint32_t instead of “void” and my previous “return” has become “return 0” and the new code has a “return 0x80000002”. That’s because I realized at this point that the “eax” register is used to return a uint32_t value. And since the first instruction was “xor eax, eax”, and we kept in the back of our mind that “eax is initialized to 0”, it means that the use case with the (config == NULL) will return 0. That’s why I made all these changes…

Very well, let’s go back to where we were, since we’ve exhausted this jump, we’ll jump back in reverse to go back to the address FFF40322 and continue from there to the next instruction. It’s a “cmp dword ptr [edx+4], 4D5F4450h”, which compares the dword at edx+4 to 0x4D5F4450, which I know to be the ASCII for “PD_M”; this means that the last 3 instructions are used to compare the first 8 bytes of our pointer to “KBLUPD_M”… ohhh, light bulb above our heads, it’s comparing the pointer to the Signature of the FSPM_UPD structure (don’t forget, you weren’t supposed to know that the function is called validate_upd_config, or that the argument is a config pointer… just that it’s a pointer)! OK, now it makes sense, and while we’re at it—and since we are, of course, reading the FSP integration guide PDF, we then also realize what the 0x80000002 actually means. At this point, our code now becomes:

EFI_STATUS validate_upd_config (uint8_t action, void *config) {
  if (action == 3) {
    FSPM_UPD *upd = (FSPM_UPD *) config;
    if (upd == NULL)
       return EFI_SUCCESS; 
    if (upd->FspUpdHeader.Signature != 0x4D5F4450554C424B /* 'KBLUPD_M'*/)
       return EFI_INVALID_PARAMETER;
  } else {
    // loc_FFF40344
  }
}

Yay, this is starting to look like something… Now you probably got the hang of it, so let’s do things a little faster now.

  • The next line “cmp [edx+28h], eax” compares the dword stored at edx+0x28 to eax. Thankfully, we now know that edx points to the FSPM_UPD structure, and we can calculate that offset 0x28 inside that structure is the StackBase field within the FspmArchUpd field…
  • and also, we still have in the back of our minds that ‘eax’ is initialized to zero, so we know that the next 2 instructions are just checking if upd->FspmArchUpd.StackBase == NULL.
  • Then we compare the StackSize with 0x26000, but the comparison is using “jb” for the jump, which is “jump if below” (an unsigned comparison), so it checks if StackSize < 0x26000,
  • finally it does a “test” of the dword at “edx+30h” (which is the BootloaderTolumSize field) against 0xFFF, then it does an unconditional jump to loc_FFF4035C, which itself does a “jz” to the return…
  • which means if ((BootloaderTolumSize & 0xFFF) == 0), i.e. the size is a multiple of 0x1000, it will return whatever EAX contained (which is zero),
  • but if it isn’t, then it will continue to the next instruction, which is the “mov eax, 80000002h”.

So, we end up with this code:

EFI_STATUS validate_upd_config (uint8_t action, void *config) {
  // eax = 0
  if (action == 3) {
    FSPM_UPD *upd = (FSPM_UPD *) config;
    if (upd == NULL)
       return EFI_SUCCESS;
    if (upd->FspUpdHeader.Signature != 0x4D5F4450554C424B /* 'KBLUPD_M'*/)
       return EFI_INVALID_PARAMETER;
    if (upd->FspmArchUpd.StackBase == NULL)
        return EFI_INVALID_PARAMETER;
    if (upd->FspmArchUpd.StackSize < 0x26000)
        return EFI_INVALID_PARAMETER;
    if (upd->FspmArchUpd.BootloaderTolumSize & 0xFFF)
        return EFI_INVALID_PARAMETER;
  } else {
    // loc_FFF40344
  }
  return EFI_SUCCESS;
}

Great, we just solved half of our code! Don’t forget, we jumped one way instead of another at the start of the function, now we need to go back up and explore the second branch of the code (at offset 0xFFF40344). The code is very similar, but it checks for “KBLUPD_S” Signature, and nothing else. Now we can also remove any comment/notes we have (such as the note that eax is initialized to 0) and clean up, and simplify the code if there is a need.

So our function ends up being (this is the final version of the function):

EFI_STATUS validate_upd_config (uint8_t action, void *config) {
  if (action == 3) {
    FSPM_UPD *upd = (FSPM_UPD *) config;
    if (upd == NULL)
       return EFI_SUCCESS;
    if (upd->FspUpdHeader.Signature != 0x4D5F4450554C424B /* 'KBLUPD_M'*/)
       return EFI_INVALID_PARAMETER;
    if (upd->FspmArchUpd.StackBase == NULL)
        return EFI_INVALID_PARAMETER;
    if (upd->FspmArchUpd.StackSize < 0x26000)
        return EFI_INVALID_PARAMETER;
    if (upd->FspmArchUpd.BootloaderTolumSize & 0xFFF)
        return EFI_INVALID_PARAMETER;
  } else {
    FSPS_UPD *upd = (FSPS_UPD *) config;
    if (upd == NULL)
        return EFI_SUCCESS;
    if (upd->FspUpdHeader.Signature != 0x535F4450554C424B /* 'KBLUPD_S'*/)
        return EFI_INVALID_PARAMETER;
  }
  return EFI_SUCCESS;
}

Now this wasn’t so bad, was it? I mean, it’s time consuming, sure, it can be a little disorienting if you’re not used to it, and you have to keep track of which branches (which blocks in the layout view) you’ve already gone through, etc. but the function turned out to be quite small and simple. After all, it was mostly only doing CMP/TEST and JZ/JNZ.

That’s pretty much all I do when I do my reverse engineering, I go line by line, I understand what it does, I try to figure out how it fits into the bigger picture, I write equivalent C code to keep track of what I’m doing and to be able to understand what happens, so that I can later figure out what the function does exactly… Now try to imagine doing that for hundreds of functions, some of them that look like this (random function taken from the FSPM module):

You can see on the right, the graph overview which shows the entirety of the function layout diagram. The part on the left (the assembly) is represented by the dotted square on the graph overview (near the middle). You will notice some arrows that are thicker than the others, that’s used in IDA to represent loops. On the left side, you can notice one such thick green line coming from the bottom and the arrow pointing to a block inside our view. This means that there’s a jump condition below that can jump back to a block that is above the current block and this is basically how you do a for/while loop with assembly, it’s just a normal jump that points backwards instead of forwards.

Finally, the challenge!

At the beginning of this post, I mentioned a challenging function to reverse engineer. It’s not extremely challenging—it’s complex enough that you can understand the kind of things I have to deal with sometimes, but it’s simple enough that anyone who was able to follow up until now should be able to understand it (and maybe even be able to reverse engineer it on their own).

So, without further ado, here’s this very simple function:

Since I’m a very nice person, I renamed the function so you won’t know what it does, and I removed my comments so it’s as virgin as it was when I first saw it. Try to reverse engineer it. Take your time, I’ll wait:

Alright, so, the first instruction is a “call $+5”, what does that even mean?

  1. When I looked at the hex dump, the instruction was simply “E8 00 00 00 00” which according to our previous CALL opcode table means “Call near, relative, displacement relative to next instruction”, so it wants to call the instruction 0 bytes from the next instruction. Since the call opcode itself is taking 5 bytes, that means it’s doing a call to its own function but skipping the call itself, so it’s basically jumping to the “pop eax”, right? Yes…  but it’s not actually jumping to it, it’s “calling it”, which means that it just pushed into the stack the return address of the function… which means that our stack contains the address 0xFFF40244 and our next instruction to be executed is the one at the address 0xFFF40244. That’s because, if you remember, when we do a “ret”, it will pop the return address from the stack into the EIP (instruction pointer) register, that’s how it knows where to go back when the function finishes.
  2. So, then the instruction does a “pop eax” which will pop that return address into EAX, thus removing it from the stack and making the call above into a regular jump (since there is no return address in the stack anymore).
  3. Then it does a “sub eax, 0FFF40244h”, which means it’s subtracting 0xFFF40244 from eax (which should contain 0xFFF40244), so eax now contains the value “0”, right? You bet!
  4. Then it adds to eax, the value “0xFFF4023F”, which is the address of our function itself. So, eax now contains the value 0xFFF4023F.
  5. It will then subtract from EAX the value pointed to by [eax-15], which means the dword (4 bytes) value at the offset 0xFFF4023F – 0xF, so the value at 0xFFF40230, right… that value is 0x1AB (yep, I know, you didn’t have this information)… so, 0xFFF4023F – 0x1AB = 0xFFF40094!
  6. And then the function returns.. with the value 0xFFF40094 in EAX, so it returns 0xFFF40094, which happens to be the pointer to the FSP_INFO_HEADER structure in the binary.

So, the function just returns 0xFFF40094, but why did it do it in such a convoluted way? The reason is simple: because the FSP-S code is technically meant to be loaded in RAM at the address 0xFFF40000, but it can actually reside anywhere in the RAM when it gets executed. Coreboot for example doesn’t load it in the right memory address when it executes it, so instead of returning the wrong address for the structure and crashing (remember, most of the jumps and calls use relative addresses, so the code should work regardless of where you put it in memory, but in this case returning the wrong address for a structure in memory wouldn’t work), the code tries to dynamically verify if it has been relocated and if it is, it will calculate how far away it is from where it’s supposed to be, and calculate where in memory the FSP_INFO_HEADER structure ended up being.

Here’s the explanation why:

  • If the FSP was loaded into a different memory address, then the “call $+5” would put the exact memory address of the next instruction into the stack, so when you pop it into eax and then subtract from it the expected address 0xFFF40244, eax will contain the offset from where it was supposed to be.
  • Above, we said eax would be equal to zero, and that’s true, but only in the use case where the FSP is at the right memory address, as expected; otherwise, eax would simply contain the offset. Then you add to it 0xFFF4023F, which is the address of our function, and with the offset, that means eax now contains the exact memory address of the current function, wherever it was actually placed in RAM!
  • Then when it grabs the value 0x1AB (because that value is stored in RAM 15 bytes before the start of the function, that will work just fine) and subtracts it from our current position, it gives us the address in RAM of the FSP_INFO_HEADER (because the compiler knows that the structure is located exactly 0x1AB bytes before the current function). This just makes everything relative.

Isn’t that great!? 😉 It’s so simple, but it does require some thinking to figure out what it does and some thinking to understand why it does it that way… but then you end up with the problem of “How do I write this in C”? Honestly, I don’t know how, I just wrote this in my C file:

// Use Position-independent code to make this relocatable
void *get_fsp_info_header(void) {
    return (void *) 0xFFF40094;
}

I think the compiler takes care of doing all that magic on its own when you use the -fPIC compiler option (for gcc), which means “Position-Independent Code”.

What this means for Purism

On my side, I’ve finished reverse engineering the FSP-S entry code—from the entry point (FspSiliconInit) all the way to the end of the function and all the subfunctions that it calls.

This only represents 9 functions however, and about 115 lines of C code; I haven’t yet fully figured out where exactly it’s going in order to execute the rest of the code. What happens is that the last function it calls (it actually jumps into it) grabs a variable from some area in memory, and within that variable, it will copy a value into the ESP, thus replacing our stack pointer, and then it does a “RETN”… which means that it’s not actually returning to the function that called it (coreboot), it’s returning… “somewhere”, depending on what the new stack contains, but I don’t know where (or how) this new stack is created, so I need to track it down in order to find what the return address is, find where the “retn” is returning us into, so I can unlock plenty of new functions and continue reverse engineering this.

I’ve already made some progress on that front (I know where the new stack tells us to return into) but you will have to wait until my next blog post before I can explain it all to you. It’s long and complicated enough that it needs its own post, and this one is long enough already.

Other stories from strange lands

You never really know what to expect when you start reverse engineering assembly. Here are some other stories from my past experiences.

  • I once spent a few days reverse engineering a function until about 30% of it when I finally realized that the function was… the C++ “+ operator” of the std::string class (which by the way, with the use of C++ templates made it excruciatingly hard to understand)!
  • I once had to reverse engineer over 5000 lines of assembly code that all resolved into… 7 lines of C code. The code was for creating a hash, and it was doing a lot of manipulation on data with different values on every iteration. There was a LOT of xor, or, and, shifting left and right of data, etc., which took maybe a hundred or so lines of assembly, and it was all inside a loop. To optimize it, the compiler decided to unroll the loop (this means that instead of doing a jmp, it just copy-pastes the same code again). So instead of having to reverse engineer the code once and then see that it’s a loop that runs 64 times, I had to reverse engineer the same code 64 times, because it was basically getting copy-pasted by the compiler in a single block. The compiler was also “nice” enough to use completely different registers for every repetition of the loop, the data was getting shifted in a weird way with different constants and different variables at every iteration, and, as if that wasn’t enough, the algorithm changed every 1/4th of the loop, which made it very difficult to predict the pattern. That forced me to completely reverse engineer the 5000+ assembly lines into C, then slowly refactor and optimize the C code until it became that loop with 7 lines of code inside it… If you’re curious you can see the code here at line 39, where there is some operation common to all iterations, then 4 different operations depending on which iteration we are doing, and the variables used for each operation change after each iteration (P, PP, PPP and PPPP get swapped every time), and the constant values and the indices used are different for each iteration as well (see constants.h). It was complicated and took a long while to reverse engineer.
  • Below is the calling graph of the PS3 firmware I worked on some years ago. All of these functions have been entirely reverse engineered (each black rectangle is actually an entire function, and the arrows show which function calls which other function), and the result was the ps3xport tool. As you can see, sometimes a function can be challenging to reverse, and sometimes a single function can call so many nested functions that it can get pretty complicated to keep track of what is doing what and how everything fits together. That function at the top of the graph was probably very simple, but it brought with it so much complexity because of a single “call”:

Perseverance prevails

In conclusion:

  • Reverse engineering isn’t just about learning a new language, it’s a very different experience from “learning Java/Python/Rust after you’ve mastered C”, because of the way it works; it can sometimes be very easy and boring, sometimes it will be very challenging for a very simple piece of code.
  • It’s all about perseverance, being very careful (it’s easy to get lost or make a mistake, and very hard to track down and fix a mistake/typo if you make one), and being very patient. We’re talking days, weeks, months. That’s why reverse engineering is something that very few people do (compared to the number of people who do general software development). Remember also that our first example was 82 bytes of code, and the second one was only 19 bytes long, and most of the time, when you need to reverse engineer something, it’s many hundreds of KBs of code.

All that being said, the satisfaction you get when you finish reverse engineering some piece of code, when you finally understand how it works and can reproduce its functionality with open source software of your own, cannot be described with words. The feeling of achievement that you get makes all the efforts worth it!

I hope this write-up helps everyone get a fresh perspective on what it means to “reverse engineer the code”, why it takes so long, and why it’s rare to find someone with the skills, experience and patience to do this kind of stuff for months—as it can be frustrating, and we sometimes need to take a break from it and do something else in order to renew our brain cells.


Ps3xport released!

Update: The method I spoke about below for getting your IDPS from an OFW console was released by flatz, look for his idpstealer release. I have also fixed a major bug in ps3xport’s ExtractFile which corrupted data, and ported ps3xport to Windows. You can find the latest version (0.2 at the moment) on the github repository and I’ve built Windows binaries here. Enjoy!

 

Hello everyone!

It’s been quite a long time and I’m very happy about that :p
Let’s do the boring part first! This is my final release for the scene, I am not “coming back” or anything like that, so don’t get your hopes up, but I needed to release this so I’d be officially done. I have never actually announced that I’m leaving the scene but everyone figured it out. It wasn’t originally done intentionally actually, but life caught up with me, work, family, lack of time, etc.. so I had little time to work on the ps3. Also, my motivation was mostly gone due to not finding anything interesting anymore, a lot of drama and I’m not a huge fan of all the attention this all brings. I got into the scene because I was curious and I wanted to learn, and I have to say I’ve learned a lot of things these past years and it was an incredible journey, but as I had lack of time and started breathing, I realized that I’ve had enough of it so I left and I am very happy with that decision because you have absolutely no idea how much of a time drain and headache this was :p
Anyways, there was one thing I did just before I left that I never got to release, but today is your lucky day as it’s release o’clock where I am! This release is a way to say Merry Christmas, Happy Holidays, etc.. to everyone, and a way for me to also say “I’m done for good, I don’t have anything left for you in a drawer somewhere” :). I’ve wanted to release this for a while now, and I even made a poll on ps3hax back in March 2012 asking people if I should (looks like ps3hax is down right now so here’s the Google cache version), and the general response was not to release it until it can be useful (when an NPDRM workaround is found), with some people saying to release it if nothing new happens in the scene.. and I think I’ve waited long enough now to know nothing new on that front will happen.

So.. since I’ve announced the release, I’ve seen a lot of speculation about what it is and what it could be.. a lot of people seem to think (or mostly, want/hope for) a downgrade method, unfortunately that’s not the case. I’ve seen some ridiculous suggestions too, like someone asking if it’s a way to run PS4 and Xbox One games on PS3.. I’m sorry to say, that’s not it either :p As I’ve said in a tweet shortly after, this is nothing groundbreaking, this is code that hasn’t been touched in 3 years, so it’s already 3 years old, but I think it’s still something that can be very useful to the community.

So here it is, I’m introducing to you : PS3xport! I’ve uploaded it to my github account here : https://github.com/kakaroto/ps3xport

What does it do? Well, it’s basically a tool for manipulating the PS3 backup data. When I say “PS3 backup”, I’m not talking about a “backup” of a game, no.. I’m talking about the full PS3 hard drive backup that you can do by going to “System Settings->Backup Utility” on your XMB. That creates an encrypted directory on your FAT32 hard drive which allows you to format your PS3 and then Restore it just like it was before. I’ve reverse engineered the file format and encryption and PS3xport allows you to create new backup data from scratch, or dump existing ones, or delete specific files from a backup or do a whole lot of other things to your backup folders. This gives you total control over your /dev_hdd0 and /dev_flash2 filesystems, which will let you install homebrew on any console, even if it’s the latest OFW version. Unfortunately, just like it was 3 years ago, you wouldn’t be able to run those homebrew apps you install due to the NPDRM ECDSA signature missing. If you have your IDPS though for example, it could let you restore a backup from one PS3 to another PS3 without losing any of your data in the transfer.

So.. what’s this about “your IDPS”? yes, the backup has two sets of files, some can be decrypted right away and some can’t because they are encrypted with your IDPS (your unique ps3 device id) which is why they can’t be restored on a different ps3. If you have a CFW, you can easily get your IDPS (I’ve written a small tool to do that, released on github, but apparently MM and Webman will also give you that information) and that will give you total control over your backup data as you would be able to decrypt and reencrypt it. If you have OFW and can’t get your IDPS, then you will not be able to dump/decode all the files from your backup, but you will still be able to create a backup that can be restored on your PS3 with no limitations (this means for example that you can restore a backup from a CFW into an OFW without any issues). I was told however that someone can get IDPS from OFW consoles and in light of this release, they might release their method soon, I can’t say more than that though, but be patient and good things come to those who wait 🙂

So my release is in two parts. First, the documentation of the file format was added to the ps3devwiki so any developer can understand how the backup archive files are created and can create their own tools. Reverse engineering that format took months of work and I won’t go into too much details about what had to be done to figure out the format but it was an incredibly long and difficult task to do that I had a lot of fun in doing. The second part of the release is of course the release of the ps3xport tool. The tool is quite powerful and you can do a lot of things with it, but it’s a command line only tool and I honestly just tested it on Linux, it’s not really my job at this point to make a windows build, or make a GUI around it, etc.. but I’m sure it won’t be long before others in the scene pick it up and make a nice GUI for it and release windows binaries. I’ve written a nice README file so everyone can understand how the tool works and what it can do. I remember though that 3 years ago just before I stopped working on it, I wanted to add a “AddPKG” command to it which would just ‘install’ a pkg into the backup data automatically, unfortunately, I never got to do it, but it should be easy to do. While I’m at it, I’m also releasing a pkg extraction tool which I found in an old directory (cool thing is the -p option in it, try it…) as well which is a PKG extraction tool that uses the PagedFile mechanism (see below) to allow for very fast pkg file access with very little memory usage even for huge pkg files, any dev can probably mix those two together to add the AddPKG feature to ps3xport.

On the software front, ps3xport.c will parse the commands then use the archive_* API which is in archive.c. That will contain all the functions needed to manipulate the archive files. It uses a ChainedList which is my rudimentary implementation of a GList-like ordered list and the archive API also uses PagedFile objects which are pretty cool. PagedFiles are a wrapper around a file which allows you to read/write to a file using pages (I set it to 64KB per page I think) so it limits the hard drive access. The cool thing about it is that it has encryption and hashing built in, so you can just set the encryption key or ask for the file to be hashed, and whenever you read/write, the encryption will be done transparently, and the coolest thing about it is that you can actually seek in the encrypted file and it will still work (it recalculates the required IV whenever you seek). The encryption there works on the stream, so you don’t need to write blocks of 16 bytes every time (thanks to the paging of the data) and it has a cool ‘splice’ method which allows you to copy data from one PagedFile to another easily, so you could in theory re-encrypt a file using a different key using 5 function calls (open *2, set_key*2, splice).

That’s about it.

I’m really happy about this release, and I want to say Merry Christmas/Happy New Year to everyone, and of course..

So long, and thanks for all the fish!

 


Introduction to Cryptocurrencies

Tips :

BTC : 15fXq5FzKaUArzQ8zBWfJADMn1qTQ5w5Y6
DOGE : DK4uNKX99VTgEv8BZdeepsYJHKhev3oR1j

Hello everyone!

Today, I received an email from a friend saying that he knows that I’m into crypto currencies recently and he wanted to know if I could give him some pointers… in true KaKaRoTo fashion, I wrote him a long email to explain everything there is to know about cryptocurrencies. I think there’s a lot of interesting stuff in there that others might find useful so I’ve decided to make it into a blog post.

Note that I didn’t edit the email, so read it as an “email to a friend”.

Concerning crypto currencies, it’s a whole world and really quite interesting. I’ll try and give you as much info as i can. Note though that everything I say below may not be 100% accurate, as I might say something wrong, either because I misunderstood it myself or maybe to keep things simplified.
I don’t know how much you know about it, so at the risk of saying things you know, I’ll just assume you know nothing about cryptocurrencies.
So cryptocurrencies started with bitcoin: in 2008, satoshi nakamoto (a pseudonym for an anonymous person or group) released a white paper explaining the concept, then created the currency, wallet, website, etc. (the network itself went live in early 2009). Of course you can check bitcoin website for more info, but the concept is basically that btc have a value, which is defined by how much people want to pay for it, but as is, it’s just a number. The system works by constantly generating coins until a maximum is reached, the concept follows gold mining where as long as you mine, you get more of it but there is a limit on the resource in the world and the value of gold depends on how much people want to pay for it, but other than that, it has no real value (it’s just a metal, right?).
I’ll explain about btc (bitcoin) then expand on the other cryptocurrencies (that we call altcoins).
Btc has a blockchain which is a public ledger which is made up of “blocks”, each containing transactions, in order to create an account you just generate a private and public key, the public key is your “account number” and the private key is your wallet. To send BTC from one person to another, you create a transaction containing your origin and destination accounts and the amount then sign it with your key then post it on the network which will add it to the current block, other nodes in the network will then check that the transaction was signed with the origin account’s private key, and they will sign your transaction in turn in order to confirm that it’s been verified.
At the same time, you have “miners” which are trying to generate the next block, more on that in a second. When a new block is generated, it gets added to the blockchain (ledger) and all new transactions get written into this new block. The previous one gets finalized. To ensure security of the network, new blocks are constantly being generated, for bitcoin it’s set to generate a new block every 10 minutes approximately. So, the blocks get generated by miners, and to do that, they need to prove that they worked for it. The proof of work is based on a hashing algorithm, basically they take the current block header’s hash, they add random data to it then they make a hash of the data. I suppose you know what a hash is, so i won’t explain that. Basically the hash result must be below a specific threshold, if it is, you found the new block, if it’s not, you need to search again. So imagine a sha1 hash (bitcoin actually uses a double SHA-256, but the idea is the same) where the first 10 bytes are 0x00000000000000000000, you must be extremely lucky to find such data that gives this kind of hash. Well that’s what the PoW (proof of work) is based on. You keep hashing millions of random data (which include the hash of the current block) until you find such a lucky hash that is below the threshold, thus proving that you did work hard to find the new block. You can see the blockchain here for the current block (at the moment) for example : https://blockchain.info/block-index/381396/0000000000000000d0f673f0241c7aca3f2453b165a2f70014362733e0daad81
You see in the top-right where it says hash/previous block/next block, you see all the 0000 it starts with. That’s the block’s hash which is below a specific threshold. The reward for finding such a hash/block is that when you create the new block, you will add a new transaction to it, the first transaction of a block is always a transfer of 25 btc from “nowhere” into your address. That’s your block reward. You can see it in the blockchain link I just gave you, it has all the transactions, and the first one has “no inputs” and has 25.07 BTC (the extra 0.07 being the transaction fees; if it shows $ value, click on the green button below the value to show it in BTC). So there you go, that’s how you mine coins and generate new ones.
Now the thing is, what is that threshold, and what happens if you can’t find the hash. Well, the threshold is called “difficulty” in the crypto world and it’s automatically adjusted after every block (or every 10 blocks, or whatever the currency creator decided when he made it; for bitcoin it’s every 2016 blocks), and it’s based on the average time needed to generate the blocks. So let’s say you have 10 miners, each with 1GH/s computation power, so the network has 10GH/s and for a difficulty of “5” (let’s assume it means first 5 bits are 0), it takes an average of 10 minutes to find the hash. Now 100 new miners join the network with 10GH/s each, that’s 1000GH/s more to the network and the total network hashrate is now 1010 GH/s.. it will be a LOT easier to find that hash, so now it only takes about 6 seconds on average to find it (10 minutes divided by 101). But BTC spec says one block every 10 minutes, so the difficulty will increase to, let’s say, 12 to account for all the new hashing power (each extra zero bit doubles the work, and 2^7 = 128 is close to the 101x increase), and now the hashes are found roughly every 10 minutes again.. some miners leave the network, difficulty goes down, etc… Of course, it’s not exact, it’s based on luck, but it’s “how probable that the next hash will be found in 10 minutes considering the current hashing power of the network”, sometimes with the same hashrate and same difficulty, it takes seconds to find the next block, sometimes it can take hours.. and yes, if it takes hours, then it takes hours, there’s nothing you can do about it, you just wait until it finds the block, then you lower the difficulty for the next one.
No one actually sets the difficulty, it’s decided upon by the entire network. Everyone runs the same code, so everyone follows the same rules and agrees with each other. If for example someone doesn’t, then his hash/transaction/whatever will not be confirmed by other miners and it’s rejected. If two people find the next block at the same time, then one will get orphaned and the other will get confirmed, not sure how that works, but there’s some race condition/concurrency protection in the way confirmations are done. The same applies for transactions and accounts, if you send money to someone but you didn’t have the correct private key, then it won’t get confirmed by anyone else and the transaction is useless. That’s why whenever you do a payment or transfer, any respectable site will tell you they wait for X confirmations before unlocking the funds for example, it’s usually 6 confirmations, which can take a few minutes, it depends on the network and the hashrate and number of miners, etc…
So.. when you have an account, your private key is your wallet, and if you lose it, then you will have no way of signing any transaction for that account, meaning that the money is lost forever. That’s why it’s always very important to make a backup of your wallet.dat file somewhere safe, or to write your key on a piece of paper, or something like that.. a lot of people have lost millions of dollars because their HDD failed and they didn’t have a backup of their wallet.dat. One even just threw it out because he thought it was useless, back when 1 BTC was valued at 0.0001$.. and then when 1 BTC became 1300$, he regretted it..
Since BTC has a public ledger (the blockchain) and transactions are confirmed by the peers, and no one owns the network, then obviously, you can see the balance of any account you want (see previous blockchain and click on any address to see its full history and balance), that’s why some people will create a few new accounts (just generate a private key locally) for every transaction and split their funds through multiple accounts, this way someone seeing a transaction won’t know which of the destination is the one being paid and which one is the new account of the account holder. It is sometimes suggested to use one new account every time you make a transaction.. but I don’t really do that myself.
Now one last item: we talked about mining, but mostly about what we call “solo mining”, which is having your CPU or GPU calculating hashes until it finds the right one and then you ‘win’ the 25 BTC reward. But if you did that for real, you would never win it considering how many people are on the network, so what people do is use “mining pools”, which are basically services that send you much smaller computations to do, you give the results back to them, and everyone joins the pool. When the pool is the one that finds the block, it will then share the reward proportionally with every miner depending on how many “shares” they sent.. so for example, here’s one of my shares for BTC in one of the pools I’m in :
Block    Value         Status   Duration     Hash Rate    Your Shares       Payout
292742   BTC 25.1399   43/120   13 minutes   10.02 Ph/s   5456/5000002904   BTC 0.00002743
So, you see the block id which yielded 25.1399 BTC, it has 43 confirmations (out of 120 required before the block is considered accepted/not-orphaned), it took 13 minutes to find it, the pool’s hash rate is 10.02 Ph/s (my rate is 11.02 GH/s), I sent 5456 shares out of a total of 5000002904 from the entire pool, and the payout is my portion of the 25.1399 BTC that was paid to me (yes, quite small for 11GH/s of hashing power, but consider the 10Ph/s of the pool… FYI, the network has 50PH/s).
Without mining pools, you wouldn’t be able to get anything.. I mean, sure I could try to find the block, and maybe I could and if I do, I’d win 25 BTC which is a LOT of money, but considering how huge the network is, it might take me years to find the block, or maybe I’ll never do.. so you join a pool and you share your luck with others. There are a few reward types a pool can use, either payout proportional to how much you contributed to that block, or proportional to the number of shares you sent in the past X minutes when a block is found (so if you’ve been on the pool for hours then you leave just before it finds a block, you still get something). anyways, not important.. the important thing to know is that if the pool isn’t the one that finds the block (there are a LOT of pools) then you don’t get anything. You can see the various reward types here : https://litecoin.info/Mining_pool_comparison#Reward_types
Oh and another thing, at specific blocks, the reward gets halved.. BTC started with a 50 BTC reward, then at block 210,000 it became 25 BTC, and it keeps getting halved every 210,000 blocks until it eventually drops to nothing, because, just like gold, the rarer it becomes, the harder it becomes to mine it. The graph of the number of coins in circulation should plateau towards the max (21 million BTC).
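Here is a short sketch of how the halving works out in numbers, assuming Bitcoin’s parameters (a 50 BTC starting reward and a halving every 210,000 blocks). Amounts are counted in satoshis (1 BTC = 100,000,000 satoshis) so the halving is a plain integer shift; the function names are mine.

```python
HALVING_INTERVAL = 210_000              # blocks between two halvings (Bitcoin)
INITIAL_SUBSIDY = 50 * 100_000_000      # 50 BTC, expressed in satoshis


def block_subsidy(height):
    """Newly created coins (in satoshis) rewarded for the block at 'height'."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY >> halvings


def total_supply():
    """Add up every block reward until the subsidy rounds down to zero."""
    total = 0
    halvings = 0
    while (INITIAL_SUBSIDY >> halvings) > 0:
        total += (INITIAL_SUBSIDY >> halvings) * HALVING_INTERVAL
        halvings += 1
    return total


print(block_subsidy(292_742) / 1e8, "BTC reward for the block in my pool example")
print(total_supply() / 1e8, "BTC will ever exist")
```

The total lands just under 21 million BTC, which is the plateau of the coins-in-circulation graph.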

Anyways, now that the technical is out of the way, let’s talk about the theory of mining.
Mining bitcoin is impossible, not at home anyways.. usually a latest gen GPU will give you a few hundred MH/s hashing SHA256, but GPU mining is so 2011, now everyone is using ASICs (Application-Specific Integrated Circuits) which do a few TH/s easily. So any coin that is SHA256 is basically impossible to mine at home.. you would probably get about 0.01$ USD after a year of mining.. which would be less than your electricity costs. That’s where Litecoin (LTC) came into action! LTC uses the Scrypt algorithm instead of SHA256 so the ASICs don’t support it, so it’s pretty much GPU exclusive, yeay! There are TONS of altcoins though, each with their own rules (how much reward per block, how much time per block) and their own specs (current difficulty, current exchange rate) which will be more or less profitable for you. The reference for me is coinwarz (http://www.coinwarz.com/) which I check every day because one very profitable coin today might be crap tomorrow. You can put your hashrate there (and electricity cost and power consumption of your GPUs) and it will tell you how much you would gain per day if you mined the coin. The problem is that Scrypt ASICs have just been released last week, so we’ll have people using ASICs for scrypt based coins soon… but they’re not that good, I mean, a 200$ ASIC gives you about 350KH/s which is slightly higher than a 200$ GPU which would give 300KH/s, but the nice thing is that it uses 2W of power, instead of 200W or whatever your GPU consumes.
There is also a new algo (kind of) called Adaptive-N Scrypt, which is just scrypt but with one of the constants made a variable (I think) which will make it hard to do an ASIC for it because memory requirements will increase everytime to prevent ASICs from catching up technology-wise. There’s also Keccak algo, but that’s only used by one coin, it’s called maxcoin and was released by Max Keiser, a financial journalist for RT (the tv channel). It was great, I made over 200$ with it pretty fast then value dropped so low that it was worth less than 20$.. thankfully, it went back up a little and I sold them and made 40$ back and lost 2 weeks of mining… if only I knew it would crash, I would have sold it when my balance was worth 200$.. but that’s part of playing the game! 🙂
This is another lesson, if you want to make it profitable, you mine something and you sell it right away into BTC (which is more stable).. I mined maxcoin because it was by far the most profitable, giving me about 20$ per day, but back then the value of one MAX was 0.01 BTC.. then it kept going down until it reached 0.0001 BTC. I managed to hold off selling until it started going back up then I sold it at 0.0004 just before it started dropping again, but if I had sold my mining revenue every day, I would have made a lot more money from it.
On the other hand, some things, you want to keep, for example, there’s Auroracoin (http://auroracoin.org) which is a coin that was created for the icelandic people who have huge financial issues and where 50% of all the coins are pre-mined and will be distributed to all the citizens of iceland, so I mined it and I thought that it would be awesome to have coins from this currency which might be widespread in an entire country.. but a couple of days ago they did the airdrop (where they allow icelandic citizens to claim their coins) and the value dropped..
Actually the value fluctuates depending on supply/demand.. if a lot of people are selling, then the price drops because you compete on the price in order to sell yours first.. if people want to buy, then price goes up. I suppose what happened with auroracoin is that people got their coins and just sold them in exchange for BTC since there was no infrastructure supporting AUR. But maybe in the future, it will start being accepted by merchants in iceland and people will start buying it and price goes back up. at least that’s the plan. Note also that it was basically a “free money for everyone” which goes against the whole idea of proof of work to get reward.. what do you do when you get free money? Note that there’s also now a SpainCoin and a GreeceCoin following the same 50% pre-mined for citizens principle, and I’m mining those as well (got 4 AUR, 20 GRCE and 68 SPA (sold 15 of SPA already when it was very high price)..)
There are also others that you want to hold onto because you know the price will go up, like for example, the very much liked and meme-based DOGECOIN! wouhouuu! ok, I do like dogecoin because it’s so popular. The dogecoin currency was created only as a joke by two guys, they never thought anyone would care, they made a logo using the shiba inu dog and used that meme as a base for the coin “wow, much coin, very transaction, etc..” and then they were wtf-ing when people actually started using it.. turns out it became extremely popular and its value is skyrocketing. Only problem is that since it was a joke, all the coins will be mined in 10 months or so, but the difference is that there is no limit on the amount of DOGE as opposed to other currencies, it will just become a really small reward after the limit. The reward was random between 10 000 and 1 000 000 DOGE per block with a block time of 1 minute. Right now, I think they changed it to become a fixed amount because people were abusing the system by only mining doge when the next reward was the highest (since it’s a consensus, remember, it means the ‘random reward’ is not random at all, it’s based on an equation using the previous block’s hash as seed), which caused increased difficulty on specific blocks, and honest miners were only getting the small rewards. Anyways, it’s been halved once already so right now, it’s 250 000 DOGE per minute, and the price is 0.00000102 BTC. Oh and yes, most exchanges are BTC to altcoin or altcoin to BTC, sometimes to/from LTC as well, then it’s USD/CAD/WHATEVER$ to/from BTC or LTC. So yeah, you do the math from your altcoin to BTC, then according to today’s value of BTC, you know your altcoin’s worth in $.
So anyways, what’s special about DOGE is that it’s very popular: reddit mostly is making it the altcoin of choice, and you hear about it everywhere, like during the US regulation talks, where they would talk about BTC and LTC (as the main coins) and they mentioned “or a coin based on a dog meme”. Doge is used a lot for tipping, charity, and all that, so you see a lot of causes revolving around doge, for example they raised 25000$ to get the jamaican bobsled team to go to the olympics, there’s doge4water (http://doge4water.org/) and just recently (last week actually, and funding finished yesterday) there was doge4nascar (http://www.doge4nascar.com/) where they raised 50 000$ to sponsor Josh Wise’s car in NASCAR racing. Since they raised it, it’s been on every news outlet (Fox news, the guardian, etc.. https://www.google.ca/search?q=dogecoin+josh+wise&safe=off&tbm=nws ) which is giving it a lot of exposure and popularity.. and just imagine that car with the dogecoin logo on national television during nascar.. this will cause people to get interested in dogecoin, and to BUY dogecoin, which will cause the price to go UP! Also, in 30 days, the reward will be halved to 125 000 DOGE, which means the value kind of has to double in order to keep miners interested in mining dogecoin for profitability.. think about it: the reward gets halved, which means miners get half as much, so if the value doesn’t double, they won’t get enough, other coins will be more profitable and they will stop mining it.. of course, the other possibility is that miners leave and the network hashrate (and so the difficulty) drops, so those who remain get twice as much as before == same profitability… Anyways, even though dogecoin’s value has been dropping a lot lately, it is bound to go back up. It had already gained 10 times its value by the time I started mining it, unfortunately, I only had very little mined at that time (now I have around 145 000 DOGE in my wallet). On the other hand.. as someone recently told me “we don’t really see many parodies of gangnam style anymore” so maybe this hype around dogecoin is a bubble about to burst.. you never know!
That being said.. it’s a game, like stock market.. you “buy” (in this case by using your GPU’s time and electricity) coins and hope it goes up.. if you don’t hope that, then sell them right away.. then move away to the next coin, etc..
one very interesting thing is at coin launches! What happens in a coin launch is that there’s pretty much no one around, so if you get in just at the right time, you can get a lot really quickly. like for example, my auroracoin, I mined it too late.. my friend told me about it and I ignored him.. difficulty was maybe 100 or 200 (don’t know what it represents exactly, but it’s not bits of 0s) and he made a few AUR in a day on his home PC GPU (a cheap gpu which gives 60KH/s) then people got “wow” over it and started mining it, the value of it jumped and difficulty became 6000 (which is HUGE for scrypt based coins) so I started mining it and only got like 2 AUR after 5 days of mining on my mining rig of 2.1 MH/s… and my friend got “rich” quickly.. he sold and bought himself a new GPU to mine some more… So the idea is to find coins that will have a lot of exposure and impact, that people will like and you mine them from the very beginning before everyone else joins. Sure, you’ll still be a drop in the ocean.. but a 2MH/s drop in 200MH/s is better than a 2MH/s drop in 60GH/s network hashrate 🙂 You can see the network hashrate on coinwarz by the way.
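To put numbers on the “drop in the ocean” comparison, here is a rough sketch of the expected earnings: your share of the network hashrate, times the number of blocks per day, times the block reward. The 50 coin reward and 60 second block time are hypothetical values picked for the example, and this is only an average, since mining is a matter of luck.

```python
def expected_coins_per_day(my_hashrate, network_hashrate, block_reward, block_time_seconds):
    """Average number of coins per day for a miner owning a slice of the network."""
    blocks_per_day = 86_400 / block_time_seconds
    return my_hashrate / network_hashrate * blocks_per_day * block_reward


MY_RIG = 2e6  # 2 MH/s

# Hypothetical coin: 50 coins per block, one block per minute.
at_launch = expected_coins_per_day(MY_RIG, 200e6, 50, 60)   # 200 MH/s network
after_hype = expected_coins_per_day(MY_RIG, 60e9, 50, 60)   # 60 GH/s network

print(f"at launch : {at_launch:.1f} coins/day")
print(f"after hype: {after_hype:.1f} coins/day")
```

Same rig, same coin, but about 300 times fewer coins per day once everyone else has joined.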
So one interesting coin coming up is H2O https://bitcointalk.org/index.php?topic=494229.0 it looked good a lot of people wanted to jump into it, the launch was meant for march 24th, but they had bugs and delayed it.. rumor is that it’s going to be this friday.. if you want to mine it and sell, then that might be my first choice. Second choice would be the spaincoin or greececoin because their difficulty is really low so you can make a lot of them really fast, and if they get adopted, their value can be worth a lot, but only in the future.. also it could crash and value becomes 0.. and of course, dogecoin! but it might be better to buy dogecoin, its price is very low right now, 102 (meaning 1 DOGE = 0.00000102 BTC) and I think it will go to 200 in less than a month.. and after nascar ( in two months) it could double again. You could also mine it, see how much you can make according to coinwarz. Be aware you need to find a good pool, if you use a small pool, there’s a chance it won’t find blocks and you won’t get anything.. a good pool finds blocks more easily, but the pool’s hashrate is obviously higher so you get a smaller share, but often.. a smaller pool will give you a bigger share, but less often, your choice which way to go.
Another way to look at it is that once ASICs start selling (there’s already one ASIC on the market, but the “Titan” asic is planned for Q2/Q3 of 2014 and is supposed to be massive), and asics take over the scrypt network, then you’ll see a LOT of people with GPU mining rigs having to shift their focus, either by selling their rigs (so expect cheap GPUs soon, by the way AMD just dropped their prices yesterday on newegg.com (not .ca)) or by making them mine an “asic-proof” coin.. and that’s where Scrypt-N comes into play.. my prediction is that soon everyone will move to scrypt-n coins (spaincoin switched recently from scrypt to scrypt-n in an update) which means the difficulty for scrypt-n coins will go through the roof. Vertcoin is the one that started it all, and is probably the scrypt-n coin with the highest difficulty and highest exchange value. My prediction is that when all gpus go to it, its difficulty will go up a lot and by consequence, the rewards will get smaller (harder to mine), which means that for it to be profitable, its value has to increase, so it will probably have a big increase in exchange rate (it already seems to be increasing steadily along with its difficulty).. so the vertcoins that I mined now (relatively easily, 5 VTC so far) will be worth a lot more in a few months (or a year).

You can go at this two ways, the hoarder mode with hopes that in 2 or 3 years, your coins will be worth millions, or a seller who will sell coins as soon as he mines them to make a profit right now.. or you can try to play the market, predict increases and decreases and all that.. You could switch from one altcoin to another, or concentrate on only one…

Now the last thing about this “theory” section is about exchanges.. obviously, altcoins can be exchanged for BTC and BTC for CAD or USD. and for that, you use exchanges, the most known one for BTC was MtGox which fucked up and closed and is under investigation and all that.. there are a few who had issues, but the one I use and most people seem to use is cryptsy.. there are other well known exchanges like Bter, bitstamp, mintpal, etc.. The way it works is that it will generate an address for an altcoin just for you, you transfer money to that address and it counts as a deposit into your account, then you can trade (sell/buy) and you can withdraw the money afterwards.. exchanges will have a balance for each of your altcoins so it kind of counts as a bank, but it’s not recommended to keep your money in anything other than your own wallet (and secure your wallet.dat). MtGox lost millions of $ and many exchanges got hacked and lost people’s money. Same rule goes for pools, when you create an account in a pool, first thing you do is set your address for transfers and enable auto-payout, you don’t want to have the pool hacked or shut down and all your mined coins still in the pool.
So, I suggest cryptsy for most of your stuff, I like it.. but while it supports a lot of currencies, it doesn’t do all of them, so you have to use others from time to time.. for example, to sell my SPA (spaincoin), I had to use Mintpal since cryptsy doesn’t do SPA. For GRCE, I only found bittrex and cryptorush that support it (for now), etc..
Remember when I said mining new coins is very profitable.. one issue though about new coins is that they will start with a high value on the market, let’s take GRCE (greececoin) for example, when it started it was valued at 0.008 (BTC per GRCE) and I was mining it, great.. now my first problem, the rewards I mined are unconfirmed in the network (to avoid an attack of the network if someone has 51% of the network hashrate, you need to wait 120 blocks after the current one before you can use coins generated in that block), so I still can’t use them.. need to wait for them to be confirmed.. by the time they were confirmed, value dropped to 0.003.. but now I can’t sell my coins because first, not many exchanges support it.. actually, only one at that time, secondly, no one wanted to buy it.. it’s still new, no one is interested, the only people who know about it are those who follow the “new altcoin announcement” threads on bitcointalk forums, and those are already miners who mined a lot of the coin and who also want to sell it, they don’t want to buy it… so if i try to sell it, it won’t work.. and when no one wants to buy and you want to sell, what do you do ? well, you sell your coins at a value lower than the market price, so as soon as a buyer comes in, he’ll buy your coins, not someone else’s.. and all the miners fight over that by lowering the market price.. by the time buyers are in, the price dropped 10 times.. and currently, it’s at 0.0001, so.. 80 times less than the initial market price. All you can hope for is that this new coin will be successful and people will like it enough that the price will eventually start going back up, and you are left with a lot of coins mined during the launch.

Ok, I think you got the theory, so now let’s talk about practice! There’s not a lot to say, you need a GPU (forget about cpu mining), it needs to be an AMD, because Nvidia SUCKS.. you can look at the hardware comparison to get a good idea : https://litecoin.info/Mining_hardware_comparison
You can use your own hardware like my friend did until you earn enough by mining to buy more GPUs, or you can buy a dedicated rig, which is what I did (and also use my desktop GPU for mining when not in use). For your info, if you buy a rig, it’s better than just buying coins because once it pays for itself, then it will keep generating free money.. but most importantly, at the end of the day, if all your coins are worthless, you still have hardware that has resale value. BUT whatever you invest in this, you have to be prepared to consider that money as “lost”.. in other words, don’t spend money you can’t afford to lose.
For reference, I “sacrificed” 1500$ and bought a rig consisting of :
Power supply + Motherboard + CPU + RAM
2 Asus AMD Radeon R9 290
The power supply is quite important because your rig will use a lot of power, and for the motherboard, you’ll want one with as many PCI-E slots as possible.. mine has 4 PCI-E 16x and 2 PCI-E 1x so I could put 6 GPUs on it.. problem is that a GPU uses two slots because of its width, so I can’t fit 4 in there.. and that’s why you can buy PCI-E risers (look for them on ebay). Anyways, with 4 GPUs, you’ll most likely need a 1500W PSU.. what I did was buy 2 PSUs of 750W because then you can power 2 GPUs with one PSU and the other 2 with the second PSU and just force the second PSU to be always on (by shorting PS_ON with ground on the ATX connector). I bought two because the PSU was on sale and even though I only need one, I thought I might as well take the second one for half price :p
The GPU was on sale as well, normal price is I think 660$, I got mine for 540$.. now on the US newegg, price dropped to 430$ from what I saw yesterday.
I chose the R9 290 because a friend of mine said they are better than similar KH/s cards in terms of power consumption. I tweaked it until I got 860KH/s per card which is not bad.
I didn’t need an HDD because I used linux running off a usb stick. If you chose to install windows, you need an HDD though..
Here’s a nice guide which I followed : http://www.cryptobadger.com/build-your-own-litecoin-mining-rig/ that website also has other nice articles if you want to read through them.
So I used the SMOS distribution which you just write to a usb key (no install) and you boot it and it starts mining right away, you’ll just have to edit the config file to point it to your own pools : http://www.smos-linux.org/
And it gives you a web access similar to this : http://bamt.webboise.com/mgpumon/
Only issue I had with it is that it has an auto-donate feature where it will stop your miner for 15 minutes and it will mine into its own pools… which was a big problem for me because when I was mining maxcoin, I had to use a different miner application and I wasn’t using their service for mining, so it couldn’t “stop the server” so it ran two instances of the miner who froze the cards and I wasted 2 days without mining before I noticed… so if you want to disable the auto-donate, just do a ‘crontab -e’ as root and remove the scripts from crontab.
If you just want to use your desktop for mining, then you can use cgminer for AMD or cudaminer for Nvidia, but as you can see in the hardware comparison wiki, nvidia cards aren’t so good for mining. If you use cgminer, you must use version 3.7.2, because any version after that will *not* work for GPUs, as they dropped GPU support. I use a fork of cgminer from “kalroth” (google kalroth cgminer) which has more options and bugfixes backported into 3.7.2. If you use scrypt-n, then you need vertminer (google it) which supports scrypt-n.
In theory it would take about 8 months to get back the price of a GPU from mining, but if the coin crashes, you lose the time you mined that coin, or if the coin gets a 10x increase in value, it could take only 1 month to get the money back.. you get the idea.
I’m not sure what other info I can give about the practical aspects of mining.. you get/have hardware, you get the miner, register in a pool, open the wallet, configure the pool in the miner (there are lots of instructions if you google for it) and start mining.. you put an auto-payout, and you hoard or sell depending on your preference.
For information, I’m using dogehouse.org as pool for dogecoin, and dedicatedpool.com for GRCE and SPA, and vertco.in for VTC.

Now for the last chapter of my book (lol). Mining BTC itself! Yes I know, I said earlier it’s impossible, but I actually said “impossible on a GPU”. I recently found this awesome site called cex.io in which you can trade BTC for GHS. It’s cloud mining but you don’t actually rent the GHs you buy them! Meaning that you get the GHS and you use it and mine from it forever or until you sell your GHS to someone else.. this means that you can buy 100GHS (would be around 600$ at current GHS price), let it mine for a while, which will give you about 6$ per day of revenue for 100GHS, then when you’re tired of it, you sell back those 100GHS. The issue here is two fold, the first is that the BTC network hashrate increases all the time, and the difficulty increases by about 30% per month I think, which means that the 6$ per day you get today, in one month will be much lower unless you buy more GHS.. The second issue is that since the value you can get from 1 GHS is lower every time the network hashrate increases, it means that the value, and so, the price of a GHS also decreases all the time, this means that when you sell back your GHS, you won’t get back the original amount of BTC (your 600$) you originally put in to buy them.. but hopefully, if you used it for long enough, then you will have earned those 600$ through mining already. I think it would take about 100 days to mine back your investment, then whatever you sell becomes extra money if you decide to sell your GHS.
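To get a feel for how much the rising difficulty matters, here is a back-of-the-envelope sketch. It assumes your daily revenue shrinks smoothly as the difficulty grows (30% per month, spread evenly over 30 days), that the BTC price stays flat, and it ignores fees; all of these are simplifications of mine, and so is the function name.

```python
def days_to_break_even(cost, first_day_revenue, monthly_difficulty_growth, max_days=3650):
    """Count the days until cumulative mining revenue covers the purchase cost.

    Returns None if the revenue decays too fast to ever reach the cost.
    """
    # Revenue is inversely proportional to difficulty, so it shrinks by this
    # factor every day when difficulty grows by the given rate every 30 days.
    daily_factor = (1.0 + monthly_difficulty_growth) ** (-1.0 / 30.0)
    earned = 0.0
    revenue = first_day_revenue
    for day in range(1, max_days + 1):
        earned += revenue
        if earned >= cost:
            return day
        revenue *= daily_factor
    return None


print(days_to_break_even(600, 6, 0.0))   # constant difficulty: 100 days
print(days_to_break_even(600, 6, 0.30))  # difficulty growing 30% per month
```

With a constant difficulty you get the simple 600 / 6 = 100 days, but once the 30% monthly growth is taken into account, the break-even point under these assumptions moves well past 100 days, so take any estimate with a grain of salt.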
The cool thing about cex.io is that you’re not renting the hardware, you’re buying it (or part of it), so if you have more GHS than what an ASIC miner provides, you can redeem the hardware and have it sent to you.. of course you’d have to pay for electricity and maintain it yourself then, so not a great idea, but it’s still good to have that option and know that you actually do own the GHS that you bought. By the way, from every earning you get from mining BTC, a portion gets taken to pay for electricity and maintenance, etc.. which is (according to them) about 13%. They have a complicated equation explained on the website: you pay per GHS you own and for the time it’s been running (since running longer = more electricity used), which means that the longer it takes to mine a BTC block, the higher the fee. Don’t forget, this is a pool so it’s a matter of luck.. I’ve seen it sometimes take a few seconds to find a block, or you find 10 blocks with 5 to 10 minutes between each block found, then you can spend 2 hours without finding any block.. the maximum fee I saw was about 50% and that was for the 2 hour delay before finding a block.. I guess overall it would average to 13% maybe.
If you want to give cex.io a try, I’d appreciate it if you use my referral link to create your account, because the referrer gets a 3% bonus in GHS when a referred user trades GHS. So if you buy 10 GHS, I’d get 0.3 extra (which I would lose if you sell, you can read the FAQ). Here is my referral link : https://cex.io/r/0/kakaroto/0/ please think about it when/if you decide to buy GHS to mine BTC directly.
The advantage of BTC is that its value is pretty stable.. sure, it fluctuates a lot, but it’s usually been between 550$ and 650$ in recent months, which is not like some altcoins (like maxcoin) which can see their value divided by 100 in a couple of weeks.
What I’ve done is use my GPU to mine altcoins, then I sold some of them (keeping AUR, VTC, GRCE, SPA and of course DOGE), and used that revenue to buy 11 GHS for myself and I’m leaving it mining now.. unfortunately, GHS price dropped in the past couple of months, so if I sell my 11GHS, then I’d have lost some money, but hopefully in a week or two, I’d be making a profit.
Another interesting point about cex.io is that you can ‘preorder’ GHS, they will add hardware on 26 of april and on 26 of may, and you can buy GHS for 0.008 for the april deployment and for 0.004 for the may deployment (currently active GHS can be traded for 0.011) so you could spend the money on the may GHS and triple the investment in 2 months… but on the other hand, it is possible that the GHS price drops by the 26th of may, and then you’d have bought GHS at the same price it would be selling on the day it becomes online, but you paid for it 2 months in advance instead of using those 2 months for mining with currently available GHS.. again, in this case, it’s a gamble. You can see the evolution of GHS value on cex.io for the past month to decide if it’s worth the risk.

Ok, I think that’s about it, I think I covered all the basics and advanced topics :-p

Don’t forget that if you decide to join cex.io, use my referral link here : https://cex.io/r/0/kakaroto/0/

Also, if you found this post interesting, you are more than welcome to send me tips to my BTC or DOGE wallets :

BTC : 15fXq5FzKaUArzQ8zBWfJADMn1qTQ5w5Y6

DOGE : DK4uNKX99VTgEv8BZdeepsYJHKhev3oR1j

Thanks for reading!

Posted in Educational | Comments Off on Introduction to Cryptocurrencies

Eleganz release for Cobra ODE

Hi everyone,

It’s been a long time since I last blogged. Today I have some exciting news for you, as I have ported Eleganz, my homebrew manager, to the Cobra ODE.

A little while ago, I tweeted that if Cobra ever released their device and did provide an open source library for integration of other managers, I would port Eleganz to it, and today I am fulfilling that promise. I would like to thank the guys over at ps3crunch.net and ps3hax.net for testing this for me, particularly Abkarino, hyappon, freddy, magneto and Xodus69.

When I released Eleganz in November 2011, I left out one small thing on the TODO list, I wanted to see someone pick it up and add the code to exitspawn to actually make Eleganz execute the homebrew apps, but no one did that in almost a year now. I am a bit disappointed that the ps3 scene (homebrew devs, not users) didn’t pick it up, but it looked like no one was interested in maintaining Eleganz in my place. Today, I am happy to see that Eleganz is not throw-away code, as it can be useful to ODE users.

I can understand why Eleganz didn’t have much appeal in the world of CFW (it was originally intended to run on OFW if my HEN ever worked), but with the ODEs running on OFW, it’s perfect for the job. It’s simple, it’s beautiful and customizable!

Not only can Eleganz list the games from the Cobra ODE and allow you to select your iso, but it will also allow you to list and run homebrew apps that you can embed in the ISO file. This way you can get access to all your homebrew in a single place, without the need to restart the PS3 or boot the homebrew’s iso from the ODE. You can just extract the eleganz iso, and add homebrew apps (that are re-signed for running from a BD drive) to the iso’s PS3_GAME/USRDIR/HOMEBREW directory and recreate the iso with the cobra tool, and that’s it.

Note that this is not an indication of me getting back into the hacking scene. I have given up on the HEN long ago as I realized that there was no way (that I could find) to run homebrew on OFW, unless they are running from a disc. I may keep improving Eleganz in the near future, but I do not plan to do anything more than that for the ps3 scene at this point.

I would also like to tell everyone that there’s no need to worry, Eleganz will not become cobra-specific, as any feature I’d implement will benefit CFW as well as ODE users. I will be releasing an updated version for CFW users soon.

I’d also like to thank magneto and the Cobra team for offering to send me a Cobra ODE as a gift for porting Eleganz to it. Once I receive it, I plan on adding disc dumping capabilities to Eleganz and improve the user experience a little without relying on others to test it for me.

You can find the latest source code on github as always and compile it yourself or you can download the pre-compiled iso file from this link : http://www.multiupload.nl/GXBBI19VOL

I hope it gets used now and you all can enjoy it and I hope I can see some cool themes created for it now!

KaKaRoTo

Posted in Development, PS3 | 17 Comments

libnice 0.1.4 released!

Hey everyone,

I have just released a new version of libnice, the NAT traversal library.

Version 0.1.4 has a few bug fixes, but the major change is the addition of an SDP parsing and generation API.

You can now more easily generate your credentials and candidates and parse them with a single API call, making it much easier to exchange candidates and establish a successful connection.

Also, I have added three examples to the examples/ subdirectory of the libnice source tree. Those examples should help anyone learn how to use libnice and what to do in order to establish a successful connection.

The first example, simple-example.c will create a new agent, and gather its candidates and print out a single line to paste on the peer. It uses the signals to asynchronously wait for events and continue the code execution.

The second example, threaded-example.c, will run the mainloop on the main thread and do everything else sequentially in another thread, waiting for signals to release the libnice thread to continue processing.

The final example, sdp-example.c, is based on the threaded example but uses the new SDP generate/parsing API to generate the candidates and credentials line to exchange between the two instances. It will base64 the SDP to make it all fit into a single line, making it easier to exchange the SDP between clients without having to parse the multi-line SDP in the example, keeping it small and concise.
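As a rough illustration of why base64 makes the exchange easier: a multi-line SDP contains line breaks, while its base64 form uses only letters, digits, '+', '/' and '=', so it can be pasted as one line. The sketch below is not taken from sdp-example.c; it uses a hand-written encoder so that it stands alone without any library, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static const char B64_ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes len bytes as standard base64 with '=' padding.
 * Returns a newly allocated NUL-terminated string (caller frees), or NULL
 * if the allocation fails. Hypothetical helper, not a libnice function. */
static char *base64_encode(const unsigned char *data, size_t len)
{
    size_t out_len = 4 * ((len + 2) / 3);
    char *out = malloc(out_len + 1);
    size_t i = 0, o = 0;

    assert(data != NULL || len == 0);
    if (out == NULL)
        return NULL;

    /* Full groups: 3 input bytes become 4 output characters. */
    while (i + 2 < len) {
        unsigned int v = ((unsigned int)data[i] << 16) |
                         ((unsigned int)data[i + 1] << 8) |
                         (unsigned int)data[i + 2];
        out[o++] = B64_ALPHABET[(v >> 18) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 12) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 6) & 0x3F];
        out[o++] = B64_ALPHABET[v & 0x3F];
        i += 3;
    }

    /* Trailing 1 or 2 bytes are padded with '='. */
    if (len - i == 1) {
        unsigned int v = (unsigned int)data[i] << 16;
        out[o++] = B64_ALPHABET[(v >> 18) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 12) & 0x3F];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        unsigned int v = ((unsigned int)data[i] << 16) |
                         ((unsigned int)data[i + 1] << 8);
        out[o++] = B64_ALPHABET[(v >> 18) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 12) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 6) & 0x3F];
        out[o++] = '=';
    }

    out[o] = '\0';
    return out;
}

/* True when the text has no line break, so it can be pasted as one line. */
static int is_single_line(const char *text)
{
    return strpbrk(text, "\r\n") == NULL;
}
```

Passing a made-up two-line SDP fragment such as "a=ice-ufrag:abcd\na=ice-pwd:efgh\n" through base64_encode gives a string for which is_single_line returns true, which is the property the example relies on.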

I hope you will find this release useful, let me know if you have any comments.

You can get the latest version here and the documentation has been updated here.

KaKaRoTo

Posted in Development, libnice | Comments Off

Eleganz: The Elegant Homebrew Manager

Hi everyone,

Last year, in January, I decided to have some fun and write a homebrew application using the EFL libraries. I decided to work on a homebrew manager, basically a replacement for the XMB. It went really well, and the development was really fast, all thanks to the awesome API and capabilities of the EFL libraries. However, I became busy and was unable to continue. It was also a bit slow, and without proper hardware acceleration it wouldn’t be as good as I hoped, so I put the project aside.
After many months, in September, thanks to gzorin’s work, we finally had a working and usable GL implementation, and the EFL apps automatically gained from it by becoming hardware accelerated. My homebrew manager was much better! But I still needed to finish a few things and I didn’t have time, so I put it to rest again.

Today, I have decided to release this homebrew application, *as is* for everyone’s enjoyment! This means that it is not fully working, it might still have some bugs here and there, but it is still a homebrew app that people can use and have some fun with. Most importantly it will serve 4 purposes :

  • Maybe re-awaken this dying PS3 homebrew scene
  • Be a good “exercise to the community” for finishing it up
  • Be a good example of what can be done with the EFL
  • Bring non-developers into writing EFL themes for the app

 

I introduce to you, Eleganz! The Elegant Homebrew Manager! A little homebrew app that lets you install pkg files and run your games directly from it. Here is the mandatory screencast video :


YouTube link to Eleganz screencast

 

I have published my app on both github and ps3dev’s gitorious, and you can also download a pre-compiled .pkg for your PS3 to have fun with it.

Here are some highlights of the application (features, limitations and bugs) :

  • The whole User Interface is completely customizable with themes
  • Installs .pkg files locally to its own data directory (won’t be visible in the real XMB, unless someone reverses the database format)
  • Does not yet run games (it’s for you to do it, use ps3load as reference maybe…)
  • Current theme is missing proper theme/images for the progressbar windows (default exquisite/E17 theme used)
  • System freezes for a few milliseconds when it tries to load a game’s background image (might be fixed if we implement a pthread library and threading support in the EFL)
  • Apparently crashes when it exits (bug)

The homebrew app comes with two themes, a dark and light theme. I like the dark one so I chose that as the default (oh, ignore that grey background ‘default’ one from that screencast video, that was just for testing). I wrote the user interface for the theme (the Edje files) while opium designed all the graphics. The theme engine in the EFL is extremely powerful, so I hope I will see tons of themes popping up. And I do not mean “change the images” themes, I want real themes, where the whole UI is different, a vertical XMB, a circular one, a 3D theme with perspective/depth for the icons, a dynamic/moving background, etc… You can learn about the .edj/.edc file format here and don’t forget to check the EDC reference wiki.

I hope to see the community pick this up and have fun with it!

That’s about it, enjoy it, and send me your patches! I’ll be waiting 🙂

KaKaRoTo

 

p.s: Forgot to say that the rules/naming conventions/etc.. of the EDC files are explained here. If a .edj file doesn’t have the appropriate parts/groups, then it will be ignored and will not show on the UI.

p.p.s: You can install the EFL on Windows and have access to edje_cc to compile your .edc into .edj.

p.p.p.s: Damn, I  keep forgetting stuff.. by the way, the whole Eleganz application works just fine on the PC too, I did all my development on the PC (that screencast was actually on Linux), *then* I tried it on the PS3 and it just worked.. so for theme development, it should be pretty easy to test without the need of a PS3.

Posted in Development, EFL, PS3 | 14 Comments

UVC H264 Encoding cameras support in GStreamer

More and more people are doing video conferencing every day, and for that to be possible, the video has to be encoded before being sent over the network. The most efficient and most popular codec at this time is H264, and since the UVC (USB Video Class) specification 1.1, there is support for H264 encoding cameras.

One such camera is the Logitech C920, a really great camera which can produce a 1080p H264-encoded stream at 30 fps. As part of my job for Collabora, I was tasked to add support for such a camera in GStreamer. After months of work, it’s finally done and has now been integrated upstream into gst-plugins-bad 0.10 (port to GST 1.0 pending).

One important aspect here is that when you are capturing a high quality, high resolution H264 stream, you don’t want to be wasting your CPU to decode your own video in order to show a preview window during a video chat, so it was important for me to be able to capture both H264 and raw data from the camera. For this reason, I have decided to create a camerabin2-style source: uvch264_src.

Uvch264_src is a source which implements the GstBaseCameraSrc API. This means that instead of the traditional ‘src’ pad, it provides three distinct source pads: vidsrc, imgsrc and vfsrc. The GstBaseCameraSrc API is based heavily on the concept of a “Camera” application for phones. As such, the vidsrc is meant as a source for recording video, imgsrc as a source for taking a single picture and vfsrc as a source for the viewfinder (preview) window. A ‘mode’ property is used to switch between video-mode and image-mode for capture. The uvch264_src source only supports video mode, however, so the imgsrc will never be used.

When the element goes to PLAYING, only the vfsrc will output data, and you have to send the “start-capture” action signal to activate the vidsrc/imgsrc pads, and send the “stop-capture” action signal to stop capturing from the vidsrc/imgsrc pads. Note that vfsrc will be outputting data when in PLAYING, so it must always be linked (link it to fakesink if you don’t need the preview, otherwise you’ll get a not-linked error). If you want to test this on the command line (gst-launch) you can set the ‘auto-start’ property to TRUE and the uvch264_src will automatically start the capture after going to PLAYING.

You can request H264, MJPG, and raw data from the vidsrc, but only MJPG and raw data from the vfsrc. When requesting H264 from the vidsrc, then the max resolution for the vfsrc will be 640×480, which can be served as jpg or as raw (decoded from jpg). So if you don’t want to use any CPU for decoding, you should ask for a raw resolution lower than 432×240 (with the C920) which will directly capture YUV from the camera. Any higher resolution won’t be able to go through the usb’s bandwidth and the preview will have to be captured in mjpg (uvch264_src will take care of that for you).

The source has two types of controls: the static controls, which must be set in READY state, and the dynamic controls, which can be changed in PLAYING state. The description of each property will specify whether that property is a static or dynamic control, as well as its flags. Here are the supported static controls : initial-bitrate, slice-units, slice-mode, iframe-period, usage-type, entropy, enable-sei, num-reorder-frames, preview-flipped and leaky-bucket-size. The dynamic controls are : rate-control, fixed-framerate, level-idc, peak-bitrate, average-bitrate, min-iframe-qp, max-iframe-qp, min-pframe-qp, max-pframe-qp, min-bframe-qp, max-bframe-qp, ltr-buffer-size and ltr-encoder-control.

Each control will have a minimum, maximum and default value, and those are specific to each camera and need to be probed when the element is in READY state. For that reason, I have added three element actions to the source in order to probe those settings : get-enum-setting, get-boolean-setting and get-int-setting.
These functions will return TRUE if the property is valid and the information was successfully retrieved, or FALSE if the property is invalid (giving an invalid name or a boolean property to get_int_setting for example) or if the camera returned an error trying to probe its capabilities.
The prototypes of the signals are :

  • gboolean get_enum_setting (GstElement *object, char *property, gint *mask, gint *default_value);
    Where the mask is a bit field specifying which enums can be set, where the bit being set is (1 << enum_value).
    For example, the ‘slice-mode’ enum can only be ignored (0) or slices/frame (3), so the mask returned would be : 0x09
    That is equivalent to (1 << 0 | 1 << 3) which is :
    (1 << UVC_H264_SLICEMODE_IGNORED) | (1 << UVC_H264_SLICEMODE_SLICEPERFRAME)
  • gboolean get_int_setting (GstElement *object, char *property, gint *min, gint *def, gint *max);
    This one gives the minimum, default and maximum values for a property. If a property has min and max set to the same value, then the property cannot be changed. For example, the C920 reports num-reorder-frames as min=0, def=0 and max=0, which means the C920 doesn’t support reorder frames.
  • gboolean get_boolean_setting (GstElement *object, char *property, gboolean *changeable, gboolean *default_value);
    The boolean value will have changeable=TRUE only if changing the property will have an effect on the encoder, for example, the C920 does not support the preview-flipped property, so that one would have changeable=FALSE (and default value is FALSE in this case), but it does support the enable-sei property so that one would have changeable=TRUE (with a default value of FALSE).

This should give you all the information you need to know which settings are available on the hardware, and from there, be able to control the properties that are available.
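To make the mask convention concrete, here is a small sketch of how a caller might test or enumerate the allowed enum values once get-enum-setting has filled in the mask. The helper names are hypothetical and not part of uvch264_src; the only values taken from the text are the ‘slice-mode’ enums 0 and 3 and the resulting mask 0x09. The mask is handled as an unsigned int here for clean bit operations, whereas the signal itself returns a gint.

```c
#include <assert.h>

/* Enum values quoted in the text for 'slice-mode' (names shortened here). */
enum {
    SLICEMODE_IGNORED = 0,
    SLICEMODE_SLICEPERFRAME = 3
};

/* Returns non-zero if enum_value is allowed by the mask, that is, if the
 * bit (1 << enum_value) is set. Hypothetical helper for illustration. */
static int mask_allows(unsigned int mask, int enum_value)
{
    assert(enum_value >= 0 && enum_value < 32);
    return (mask & (1u << enum_value)) != 0;
}

/* Writes every allowed enum value into out[] (room for 32 entries)
 * in ascending order and returns how many were written. */
static int mask_to_values(unsigned int mask, int out[32])
{
    int count = 0;

    assert(out != 0);
    for (int value = 0; value < 32; value++) {
        if (mask_allows(mask, value))
            out[count++] = value;
    }
    return count;
}
```

With the mask 0x09 from the example above, mask_to_values yields the two values 0 and 3, and mask_allows rejects every other value, so an application could use it to grey out unsupported choices in its UI.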

Since these are element actions, they are called this way :

  gboolean return_value, changeable, default_bool;
  gint mask, minimum, maximum, default_int, default_enum;

  /* Output arguments follow the prototypes above; the return value location comes last. */
  g_signal_emit_by_name (G_OBJECT(element), "get-enum-setting", "slice-mode", &mask, &default_enum, &return_value, NULL);
  g_signal_emit_by_name (G_OBJECT(element), "get-boolean-setting", "enable-sei", &changeable, &default_bool, &return_value, NULL);
  g_signal_emit_by_name (G_OBJECT(element), "get-int-setting", "iframe-period", &minimum, &default_int, &maximum, &return_value, NULL);

Apart from that, the source also supports the GstForceKeyUnit video event for dynamically requesting keyframes, as well as custom upstream events to control LTR (Long-Term Reference frames), bitrate, QP, rate-control and level-idc, through, respectively, the uvc-h264-ltr-picture-control, uvc-h264-bitrate-control, uvc-h264-qp-control, uvc-h264-rate-control and uvc-h264-level-idc custom upstream events (read the code for more information!). The source also supports receiving the ‘renegotiate’ custom upstream event, which will make it renegotiate according to the caps on its pads. This is useful if you want to enable/disable h264 streaming or switch resolution/framerate easily while the pipeline is running: simply change your caps and send the renegotiate event.

I have written a GUI test application which you can use to test the camera and the source’s various features. It can also serve as a reference implementation on how the source can be used. The test application resides in gst-plugins-bad, under tests/examples/uvch264/ (make sure to run it from its source directory though).

 

uvch264_src test application

You can also use this example gst-launch pipeline for testing the capture of the camera. This will capture a small preview window as well as an h264 stream in 1080p that will be decoded locally :

gst-launch uvch264_src device=/dev/video1 name=src auto-start=true src.vfsrc ! queue ! "video/x-raw-yuv,width=320,height=240,framerate=30/1" ! xvimagesink src.vidsrc ! queue ! video/x-h264,width=1920,height=1080,framerate=30/1,profile=constrained-baseline ! h264parse ! ffdec_h264 ! xvimagesink

That’s about it. If you are interested in using uvch264_src to capture from one of the UVC H264 encoding cameras, make sure you upgrade to the latest git versions of gstreamer, gst-plugins-base, gst-plugins-good and gst-plugins-bad (or 0.10.37+ for gstreamer/gst-plugins-base, 0.10.32 for gst-plugins-good and 0.10.24 for gst-plugins-bad, whenever those versions get released).

I would like to thank Collabora and Cisco for sponsoring me to work on this great project, it couldn’t have been possible without them!

If you have any more questions about this subject, feel free to contact me.

Enjoy!

 

Posted in Development | 8 Comments

RSXGL working and usable

Hi everyone!

When the PS3 homebrew scene started, a lot of people were complaining that it wasn’t possible to write 3D games for the PS3 because of its lack of OpenGL library.
Almost a year ago, Alex Betts thought he would tackle this problem and he started working on RSXGL… an implementation of the OpenGL 3.1 specification written from scratch targeting the PS3’s RSX. Anyone in their right mind would say that it’s impossible, that it’s too much work, but Alex spent the last year working on it, alone, until it became usable. You can read some news about it here.
For some reason though, no one used it to build their own apps. Maybe the status of the project was scaring them: it was said to be incomplete, there was no GLSL support, etc…
I am writing today to tell you that RSXGL is perfectly usable! It supports online GLSL compilation, as well as any feature you might want. As proof, I have written a new hardware accelerated engine for the EFL using RSXGL and it worked great! Alex and I spent a lot of time testing and fixing all the issues that were in RSXGL that were made visible by the EFL’s GL engine and I am happy to say that it’s working now. Expedite is able to run most of its tests at 50 to 60 fps on 1080p resolution (instead of the average of 5 to 10 fps it had on 720p before).
You can see performance tests right here (Running some tests from expedite) :
Software rendering: http://dl.dropbox.com/u/22642664/expedite_psl1ght.log
RSXGL rendering: http://dl.dropbox.com/u/22642664/expedite_rsxgl.log

Please give RSXGL a try. Also, you can get the latest EFL version from my repository, which includes the gl engine for ps3. Now, any EFL application will be automatically hardware accelerated thanks to RSXGL. I hope we can see some new games (or old GL games being ported) soon!

RSXGL : https://github.com/gzorin/RSXGL
EFL : https://github.com/kakaroto/e17

Enjoy!

Posted in Development, EFL, PS3 | 29 Comments

Exquisite tool becomes a library!

The exquisite tool that comes with Enlightenment is a nice, pure edje application that is used for showing boot process splash screens. I thought it was a nice splash screen and more generally, a nice progress bar and wanted to use it in my own apps.

I have modified the exquisite tool to become a library so it can be used by others in their applications, while keeping the exquisite and exquisite-writer tools intact (they will now depend on libexquisite.so though).

Since it’s a very simple feature (only a couple hundred lines of code), the API is simple as well. Here’s an example of use :

  Evas_Object *obj = exquisite_object_add (evas, theme);
  evas_object_show (obj);

  exquisite_object_title_set (obj, "Title of the screen");
  exquisite_object_message_set (obj, "My Message");

  int test_id = exquisite_object_text_add (obj, "First test");
  int second_test = exquisite_object_text_add (obj, "Second text!");
  exquisite_object_status_set (obj, test_id, "FAIL", EXQUISITE_STATUS_TYPE_FAILURE);
  exquisite_object_status_set (obj, second_test, "OK", EXQUISITE_STATUS_TYPE_SUCCESS);

  exquisite_object_pulsate (obj);
  exquisite_object_progress_set (obj, 0.95);

That’s it, and you get a nice screen with title, message, text area for status messages (if you want it), and a progress bar. This also means you can use the default theme from exquisite or write your own using the same theme specification which can later on be used by others.

Please review the API provided. This is the right time to suggest any changes to the API or improvements to the library, so let us know what you think!

Posted in Development, EFL, PS3 | 6 Comments

Upgrading from Fedora 15 i686 to Fedora 16 x86_64

A couple of months ago I bought a new laptop with 8GB of RAM, but I realized I was running a 32-bit system, which meant I couldn’t use all my RAM. I had to switch to 64-bit. It takes so much time for me to restore my system that I didn’t have the courage to go through it again (I did it last year when I switched from Debian to Fedora, and it took me a week), so I stayed with 32-bit. Yesterday I had to upgrade to Fedora 16 and decided to switch to 64-bit at the same time… I’d like to share my experience with you!

First of all, I had to download the 64-bit version of the Fedora CD, which is not the default download on the website. I had to click on the small “more download options” link to get the choice, and I realized that’s how I got the 32-bit install in the first place (the Fedora download page should definitely list both links). Then I made a backup of the list of all the installed packages on my system so I could restore them on the new system :

 yum -C info $(rpm -qa) | grep "Name   :" | cut -c 15- > packages-list.log

This will list all of the packages installed, and ask yum for the exact name of the package (instead of “git-1.7.6.5-1.fc15.i686”, it becomes “git”).. if you have a better method of doing that, let me know, but this did the trick for me.

Update: A better method was given to me by Hansen and Richard Godbee in the comments : rpm -qa --qf "%{name}\n" > packages-list.log
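For readers curious about what that name extraction amounts to, here is a small sketch of the string transformation itself, from “git-1.7.6.5-1.fc15.i686” to “git”. It assumes the usual name-version-release.arch layout, in which neither the version nor the release contains a ‘-’, so the name is everything before the second-to-last ‘-’. The function name is hypothetical; it is not part of rpm or yum, which do this properly by querying the package database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the package name out of a "name-version-release.arch" string.
 * Returns 0 on success, or -1 if the input does not contain the two '-'
 * separators, has an empty name, or the name does not fit in out. */
static int rpm_name_from_nvra(const char *nvra, char *out, size_t out_size)
{
    const char *release_sep;
    const char *version_sep = NULL;
    size_t name_len;

    assert(nvra != NULL && out != NULL);

    /* The last '-' separates the version from "release.arch". */
    release_sep = strrchr(nvra, '-');
    if (release_sep == NULL)
        return -1;

    /* The '-' before that one separates the name from the version. */
    for (const char *p = release_sep; p > nvra; ) {
        p--;
        if (*p == '-') {
            version_sep = p;
            break;
        }
    }
    if (version_sep == NULL || version_sep == nvra)
        return -1;

    name_len = (size_t)(version_sep - nvra);
    if (name_len + 1 > out_size)
        return -1;

    memcpy(out, nvra, name_len);
    out[name_len] = '\0';
    return 0;
}
```

Because only the last two separators are considered, a name that itself contains hyphens is kept whole, which is why a plain “cut at the first dash” would not be enough.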

I obviously had a separate partition for the /home directory, which made things easier, so I backed up the important directories into it: /opt, /root, /etc, /usr/local and my scratchbox home dir. Then came the moment of truth: reboot into the live CD, install it, make sure not to format the /home partition, and reboot into the new 64-bit system.

First of all, as soon as I tried to log in, gnome 3 would completely crash and would not let me in, so I had to create a new user, log in to gnome 3, “ls -la” the files in the new user’s home dir, then delete (move away) those same files/directories from my own home dir so that gnome doesn’t crash anymore… apparently, my settings suddenly became incompatible or something… It’s important to note that I had some further problems later and I had to copy back .gnome2/keyrings, otherwise the gnome-keyring daemon would freeze.

To restore all the packages that I had before, I first had to re-install (manually) the rpmfusion repository (free and nonfree), then I just did a simple :

yum install $(cat packages-list.log)

And after 1.2GB of downloads and 1020 package installs, my system was technically “restored” to how it was before the format. I looked at the “No package foobar” lines given by yum at that point, which told me what I needed to install manually (opera, skype, dropbox), which I did, along with a few libs that apparently don’t exist anymore in Fedora 16. Now I just had to restore the /opt for some apps I had in there (and recompile the EFL/E17), copy the Enlightenment.desktop file to /usr/share/xsessions, restore my /etc/hosts (which had some custom entries), restore some custom scripts I wrote into /usr/local/bin and recompile the libraries I was working on and had installed in /usr/local (gstreamer, libnice, farstream). I also had to install a few 32-bit libraries so I could install skype (which only comes in a 32-bit flavor).

It took me about a day of work/compilation, but now I feel back home, and I don’t notice any difference in my system other than the fact that I will now be writing 32-bit bugs instead of 64-bit bugs 🙂

 

 

Posted in Uncategorized | 12 Comments