RECOGNIZING C CODE CONSTRUCTS IN ASSEMBLY


DATE: Fri Jan 8 21:29:38 CST 2021

I am very happy to get my very first internship. It is not a stop, but a beginning. You got this Kai!

Theory: it is all about finding the patterns.


Guess:

  1. When a variable is declared outside any function (a global), it lives at a fixed memory address and is referenced by that address. It is not affected when a function returns.
  2. When a variable is declared inside a function (a local), it lives in the function's stack frame and is referenced by an offset from EBP. When the function returns, the frame is released and the locals are gone.

cdecl:

The most popular calling convention.

In cdecl, parameters are pushed onto the stack from right to left, the caller cleans up the stack when the function is complete, and the return value is stored in EAX.

int test(int x, int y, int z);

int a, b, c, ret;
ret = test(a, b, c);
push c
push b
push a
call test
add esp, 12
mov ret, eax

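Notice the push order: the arguments are pushed right to left, so the first argument (a) ends up on top of the stack, closest to the return address. The function does not pop them; it reads them at fixed offsets from EBP ([ebp+8], [ebp+0Ch], [ebp+10h]), and the caller removes them afterwards with add esp, 12.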

stdcall

The stdcall convention is similar to cdecl, except that stdcall requires the callee (the function) to clean up the stack when it returns, so the caller has no add esp after the call. This is also the standard calling convention for the Windows API.


Fastcall

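The first few arguments (typically two) are passed in registers, with the most commonly used registers being ECX and EDX (the Microsoft fastcall convention). Additional arguments are pushed onto the stack from right to left. Who is responsible for the cleanup varies across compilers; in Microsoft's __fastcall, the callee cleans up the stack.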


Push Vs. Move

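Guess: push takes fewer bytes of instruction encoding than mov.

Compilers may use different instructions to perform the same operation, usually when the compiler decides to move values onto the stack rather than push them. In that case the stack space is reserved up front and the arguments are written with mov to offsets from ESP, so there is no push before the call.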


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/0e3e80b2-a42d-4d57-bc9d-6bd9ec71e417/Untitled.png

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/facb0541-337a-4354-a3cd-e2a366c72667/Untitled.png

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/20065ea1-0f7c-44b5-8c58-808bdf36f5ad/Untitled.png


Switch statement

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/3ad7d560-3ee2-48f8-8b8d-191a37854435/Untitled.png

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/ee90557f-6ba3-4b43-b9a2-d267ae8fee62/Untitled.png



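A more efficient version of the assembly code uses a jump table: the switch value is used as an index into a table of code addresses, so one indirect jump replaces the chain of comparisons.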

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/9e3b9853-153a-4d8c-94b3-983a1cf4ed52/Untitled.png


Disassembling Arrays

Guesses:

Arrays are laid out in memory as a contiguous sequence of elements, so element i sits at base address + i * element size.

A linked list node, by contrast, stores two fields: 1. the value, and 2. a pointer holding the address of the next node. Nodes do not need to be adjacent to one another in memory.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/ab69725b-1fe9-42c5-b674-6949dbe3e8a2/Untitled.png

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/7881eeea-e058-4c40-9942-67644e3de350/Untitled.png


Identifying Structs

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/f6afbac7-7e81-4a04-ac0a-8a9974d90e0e/Untitled.png

Main:

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/e9bf2715-4335-4a92-a1e9-3f6c03118281/Untitled.png

Test()

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/bc6a6245-839e-470c-852b-b44326b55ca3/Untitled.png

The flow of test(): the gms pointer is moved into eax, and the value 61h ('a') is stored at [eax+14h]. The pointer is then moved into ecx, and a double is stored at [ecx+18h]. [ebp+var_4] (the loop counter i) is set to 0, and execution jumps into the loop.

In the loop, i is compared to 5. If it is less than 5, i is moved into eax (the index) and the pointer into ecx.

i is moved again into edx; this edx is the value stored into array[i] by mov [ecx(pointer) + eax(index) * 4], edx(i). Execution then jumps back to the increment step, where i is loaded into edx, incremented by 1, and written back to [ebp+var_4] for the next comparison.


Linked List Traversal

A linked list consists of a sequence of data records, and each record includes a field that contains a reference (link) to the next record.

#include <stdio.h>
#include <stdlib.h>

struct node{
  int x;
  struct node * next;
};

typedef struct node pnode;

int main() {
  pnode * curr, * head;
  int i;
  head = NULL;
  for (i=1; i<=10; i++){
    curr = (pnode *)malloc(sizeof(pnode));
    curr->x = i;
    curr->next= head;
    head= curr;
  }
  curr = head;
  // Walk the list from head until the NULL terminator.
  while(curr){
    printf("%d\n", curr->x);
    curr = curr->next;
  }
  return 0;
}

// Each new node is inserted at the head and points to the previous one: 1<-2<-3<-4<-5<-6<-7<-8<-9<-10 (head is 10).
// So the output is 10,9,8,7,6,5,4,3,2,1, one number per line.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/548850b3-81cb-4bb5-b190-853d06883118/Untitled.png

Code flow: var_4 is the 4-byte pointer curr, and var_8 is the pointer head.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/4a707b64-3949-46bf-9ba3-63855341821d/Untitled.png