Inversing: 2017

Wednesday, November 22, 2017

Mitigating DLL MITM - Manual DLL Importation

Hello, guys!

In my last post I wrote about a DLL Hijacking / Man-In-The-Middle technique based on DLL importation. At the end of that post I listed some countermeasures to mitigate this type of hijacking. The method I describe here is effective and a little tricky for reverse engineers, but of course it's "ugly" and manual. It's just an idea, an attempt to think creatively.

Basically, instead of using the IAT to reach the function in the DLL, you set the address of the function manually. To do that we only need the offset of the function inside the DLL: we get the handle of the DLL module (its base address) and add the offset to it.

I will use the same example as in the last post. You can get the function offset in many ways. In this example I just grabbed the resolved address from the IAT and subtracted the module base to get the offset of the function.
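The arithmetic behind this idea can be sketched in a few lines of Python. This is only an illustration and not the code from the project: the helper names and all the addresses are hypothetical, chosen just to show how the offset is obtained once and then reused with whatever base the module gets at run time.

```python
def function_offset(resolved_address: int, module_base: int) -> int:
    """Offset of a function inside its module: resolved address minus module base."""
    if resolved_address < module_base:
        raise ValueError("address lies below the module base")
    return resolved_address - module_base


def manual_address(module_base: int, offset: int) -> int:
    """Address to call at run time: base (handle) of the loaded module plus the offset."""
    return module_base + offset


# Hypothetical numbers, only to show the arithmetic.
offset = function_offset(0x6C301560, 0x6C300000)
print(hex(offset))                               # 0x1560
print(hex(manual_address(0x10000000, offset)))   # 0x10001560
```

The offset is fixed for a given build of the DLL, so a different image loaded under the same name will almost certainly have something else at that position.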

With the offset in hand, we can code a dynamic function call in dll-consume.

As you can see, I declare a function type, create a pointer of that type and then set the address of the function dynamically. This way I know exactly where the call will go: to my function. If I try to load the fake DLL the program will crash, because the fake image doesn't have the function at the same offset. I renamed the function because I included windows.h, and this header already declares a send function.

The only ways to bypass this are patching the application or generating a DLL with this function at the same offset.

Conclusion

In conclusion, we just have to be creative to mitigate flaws in software. Always think both offensively and defensively: try to break your own programs and, at the same time, think about how to mitigate the attack. If you need anything at all, just contact me. Thanks! :)

Sunday, November 12, 2017

DLL Hijacking - DLL Man-In-The-Middle

Hello, guys!

I hope that everybody is doing fine! :)

I was working on this PoC in my free time and now I finally have enough time to write up the technical details. It took a while because I first wanted to publish some introductory posts before writing the technical stuff. So let's do it!

First of all, I hope that you have read my previous posts for a better understanding of what I will show here, because I will not explain everything in detail. So the prerequisites for this post are:

What did I use to do this?

Github Project

dll-manual-hijack

Overview

I have been studying PEB hooking in more depth (this post is not about PEB hooking) to learn more thoroughly how Windows internals work. My way of learning is coding and practicing. I like to brush bits :). Basically, PEB hooking steals a module from some process by changing addresses in the LdrModules located in the PEB. But for this I needed a DLL that exports the same functions as the original does.

To export all the functions of a DLL I coded the "DLL Exporter" tool, which extracts the functions exported by a given DLL. It reads the export directory and generates 2 files for NASM. I included the generated code in the main ASM code of the fake DLL image. Of course for this example it doesn't matter much, because I only exported 1 function, but it is worth taking a look at it to see how it works.

In this example I coded a program that simulates a message exchange; there is no socket connection, to keep it simple. But you get the point, right? The program passes the message to a DLL that encodes it and hypothetically would send the encoded message somewhere.

To do this PoC I made a fake DLL that exports the same function as the original. This fake DLL grabs the message, logs it to a file before it is encoded and passes execution on to the original one, without injecting any code into either the executable or the original DLL; it just rewrites the IAT of the fake DLL. I gave the fake image file the same name as the original image, so when the Windows Loader searches for the DLL it loads the fake one.

Problems

Theoretically I just have to let Windows load the fake DLL, and inside the fake image I load the original DLL; then I can call all the functions that belong to it. I could do this just with LoadLibrary and GetProcAddress, right? But imagine a DLL with a LOT of functions: the work I would have to do to make everything callable would be huge and painful.

So what I did was link the original DLL to my fake DLL, and that is where I found the problem. When the program loads the fake DLL, the original DLL should consequently be loaded too. But since the fake image has the same name as the original image, the Windows Loader ends up linking the fake image to itself and we are stuck: when the process calls the fake image, it can't jump to the original image and keeps calling itself forever. To set everything in the right place I had to alter the IAT (Import Address Table) of the fake DLL.

Solution

As a solution to this problem I had to rewrite the IAT with the original function addresses, and I did it with NASM. At this point, if you don't know what an IAT is, please read my previous posts and all the references I gave there. The solution is to load the original DLL image, save its AddressOfFunctions (Export Directory), save the fake DLL's IAT (FirstThunk - Import Directory) and change the IAT memory protection (it is protected against writing). Then I iterate through AddressOfFunctions, overwriting each IAT entry with the address of the corresponding original function. Now all the imported functions in the fake DLL image will call the functions of the original DLL image.
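The rewriting loop can be modeled in Python as below. This is a simplified sketch and not the NASM code from the project: the IAT and the AddressOfFunctions array are plain lists, the function name and all the addresses are hypothetical, and it assumes that the IAT slots are in the same order as the export entries (as they are in this PoC with a single function).

```python
def rewrite_iat(iat, original_base, export_rvas):
    """Overwrite each IAT slot with the absolute address of the matching export.

    iat           -- list of addresses currently stored in the fake DLL's IAT
    original_base -- base address where the original DLL was loaded
    export_rvas   -- RVAs read from the original DLL's AddressOfFunctions
    """
    if len(iat) != len(export_rvas):
        raise ValueError("IAT and export table must have the same number of entries")
    for i, rva in enumerate(export_rvas):
        # In the real process the page must be made writable first.
        iat[i] = original_base + rva
    return iat


# Hypothetical values: the slots start out pointing back into the fake image.
iat = [0x10001000, 0x10001010]
rewrite_iat(iat, 0x6C300000, [0x1560, 0x15A0])
print([hex(a) for a in iat])   # ['0x6c301560', '0x6c3015a0']
```

A real DLL with many exports would need a mapping by name instead of by position, because the import order is not guaranteed to follow the export order.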

Splitting the Project

I will give a brief explanation of the source code. I left comments in the source to help with the reading, but the code itself is mostly self-explanatory. Anyway, if you have any doubt or question, just contact me. In the next posts I will bring CMake files to make it easy to compile; for now I just made some PowerShell scripts to do it in each folder.

Folders

This folder has the code of dllsend.dll.

dll-consume

This folder contains the code of the program to consume the dllsend.dll.

dll-exporter

This folder contains the program that receives a DLL on the command line and generates two files: dll_parameter.dll.ext and dll_parameter.dll.subs.

dll_parameter.dll.ext contains the "extern" code of all the exported functions, to append to the ASM code.
dll_parameter.dll.subs contains the function declarations, to append to the ASM code.

dll-fake-asm

This folder contains the body of the fake DLL image. I coded the whole thing in NASM and used GoLink to link the compiled object with the original DLL image.
make.ps1 has the logic to link the fake DLL image.
In GoLink you have to specify all the exported functions. You can adapt the DLL Exporter to generate this part easily.

Walking through the process

I will assume that you have compiled everything, and I will just walk through the main parts of the process. dll-consume is very straightforward: you type your message, then it gives you back the encoded string and its hex dump.

dll-consume compiled

I renamed the original DLL by adding a "2" to the end of its name (dllsend2.dll). So let's see what happens inside this process. You will see that, since the fake DLL is pure ASM code, its size is much smaller than the original DLL image compiled with GCC.

List of files

I already mentioned before how to find the OEP (Original Entry Point) of programs compiled with GCC. You can find the pattern that GCC uses, or you can go directly to the inter-modular calls made by the module m.exe. This is the main function:

Main function of m.exe

As I always say, take a moment to walk through the program; it's important to train your eyes with assembly code. In the image above you can see the IAT of the m.exe module at work in the line that calls the send function, where there is a jump to the imported function. At this point in debugging you can see all the loaded modules in the memory tab. Since the original DLL was loaded by the entry function of my fake DLL image, it already appears there.

Memory mapping

The main/entry code of the fake DLL image was already executed when the Windows Loader loaded the fake image. To step through this part in the debugger, you can check the Dll Entry option in the preferences.

Dll Entry - Debugging

I had already set a breakpoint on the send call, then I stepped into it until I reached the fake send function.

As you can see, it's pretty straightforward. I imported some C functions to save the message in the file log.bin: I opened a handle to the file, wrote to it and closed the handle. After logging, I restored the register context and jumped to the original send function.

I could have enhanced this fake send function to make the log better, but for this PoC it would be pointless. To practice ASM, try to enhance it inside the fake DLL code: add a line break after each message saved to log.bin. If you analyze the code you will see that it is easy.

Fake function 'send' disassembled

I will not go step by step through the IAT rewriting; I leave it to you to debug. Open the DLL in CFF Explorer and debug the entry function of the fake DLL. The code itself has a lot of comments. If you need any help just contact me, I would be glad to help. ☺

Countermeasures ?

There are some countermeasures against it; you just have to be creative. For example, you could check the MD5 of the DLL before calling it, or you could count how many modules are loaded in the process memory space. As you saw in the walk-through, there is an extra module, dllsend2.dll, and with this information you can determine whether your application has been compromised or not.
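The hash check can be sketched with Python's standard hashlib module. This is only an illustration of the idea: the function names are hypothetical and the expected digest would have to be stored somewhere the attacker cannot change. MD5 is used because it is the hash mentioned above; it has known collision weaknesses, so passing "sha256" as the algorithm is the safer choice.

```python
import hashlib
import hmac


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_expected_dll(path, expected_hex, algorithm="md5"):
    """True only if the file on disk matches the digest recorded for the original DLL."""
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.lower())
```

The application would call something like is_expected_dll("dllsend.dll", known_digest) before loading the library and refuse to continue when it returns False. Keep in mind that the check itself can be patched out, so it only raises the bar.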

Anyway, it's hard to fight against these things. In the RE world, programmers have to do all they can to make the work of reversers as hard as possible in order to discourage the RE itself.

Conclusion

In the computer world everything is possible; it's just a matter of time to understand how to do it. I did this PoC for learning purposes and to better understand Windows internals. I hope that you can learn a little from this technical example too. Thanks for reading.

I didn't walk through the loading process of the fake DLL image because I think my comments in the code are pretty self-explanatory. Anyway, if you have any questions feel free to contact me.

Saturday, November 4, 2017

PEB - Process Environment Block

Hello again! ✌

First of all, I apologize for any mistakes that I might have made here. I'm always open to opinions and tips. I hope this can be useful to somebody.

In this post I will try to introduce you to the PEB (Process Environment Block), a structure present in every Windows process that contains very important information, so it's worth knowing it better. But before getting into it I will introduce some basic concepts: what processes, threads and jobs are. We need to know the difference, because they have different structures.

As I always say, I will keep this post simple and objective. At the end of the post I will leave good references so you can do your own in-depth research. If you really want to know Windows you need to get the Windows Internals book; it is a must for learning more about this operating system.

Processes

The concepts of process and program are important to understand. A process is not a program: programs are sequences of instructions, and processes are containers where the programs are included along with other properties and resources. So in order to run a Windows program a container is needed, where it can run freely with everything it needs already inside. Windows guarantees that.

A Windows process basically comprises:

Private Virtual Address Space

A private virtual space in memory to allocate everything the program may need to run.
Processes can't access other processes' virtual memory space unless something explicitly allows it.

Program

The program is the part of the process that has the instructions for the processor to execute.

List of Handles

Handles are references to resources used by the program, such as synchronization objects (basically objects shared between threads), open files and so on.

Security Information

All the security information about the process, such as user, user groups, privileges in the system, User Account Control (UAC), session and so on.

Process ID

Unique identifier of the process.

Threads

At least one thread will be running for any given process.

If you read my last post about PE files you should be able to relate processes to PE files in memory. When you create a new process with a well-known API like CreateProcess, it expects in the first parameter (lpApplicationName) a valid PE file to load into memory, even if the file doesn't have the .exe extension.

It's important to say that CreateProcess is a Windows API, but a high-level one, so to speak. For example, there are default processes that are created at the beginning of the Windows boot process, like Smss.exe. Smss (Session Manager) isn't created by CreateProcess; instead it is created by NtCreateUserProcess (Ntdll), since it is created directly by the kernel.

There is a difference between native processes and the normal processes that are created after Windows boot initialization. Native processes like Smss.exe, Csrss.exe and so on cannot be created as common processes; CreateProcessInternalW will reject native images. There are several functions used to create a process, but all of them end up calling CreateProcessInternalW, no matter which one was called first.

Every process created in the system has a structure called EPROCESS (Executive Process), which is maintained in the system address space (the memory space reserved for the Windows system). As we will see later in this post, each process also has a structure in the user address space called PEB (Process Environment Block). So just keep in mind that Windows maintains a structure for everything it creates, and these structures always hold a reference in memory to each piece of information they keep. When the "process" is executing in user mode (user address space) it uses the PEB as a reference to access the objects allocated in memory. When execution is in kernel mode (kernel address space) and needs some information about a process, it uses the EPROCESS. However, even though these are the main representations of a process on each side of the operating system (kernel mode and user mode), there are more structures created by other components; Windows Internals covers this thoroughly.

So, as mentioned before, the process isn't the program itself; when a process is created, a thread is created with it to execute the main function of the program. It's important to know this difference: the thread executes the program, and it is the thread that holds all the CPU context needed to execute.

Threads

A thread is basically a unit of execution inside the process, with a structure that holds the processor context information. So what runs the program is the thread. Properties of a thread:

The Processor Context

The state of the CPU registers; the information about the program state is contained in the thread context.

Stack information

Two stacks: one for kernel mode and one for user mode.

Thread ID

Unique identifier of the thread.

Since we have a lot of programs running on our machine, the processor must know the state of each program running simultaneously. Basically, when Windows switches between threads it saves the processor context of the outgoing thread and restores the context of the incoming one; this way the system never loses the state of a program since it last ran.

A thread can run in user mode and in kernel mode, so it must have a stack for each one. Just as the process has the PEB structure, the thread has the TEB structure; both of them live in user mode. The system maintains structures both in user mode and kernel mode, so that user-mode components that want to grab information about processes or threads don't need to make a system call.

Like the process, the thread has an executive structure (ETHREAD) that represents the thread object in the system address space for kernel usage. There are more structures maintained by the system for the same thread and process. For example, since the Windows architecture can support more than its own subsystem (Windows Subsystem - Csrss.exe), the subsystem creates a CSR_THREAD and also a CSR_PROCESS. These structures are maintained by the Windows Subsystem for internal purposes.

The scheduler and dispatcher are the Windows components that switch threads. They are responsible for loading the thread context to start or continue the execution of a program. Basically, threads can be interrupted, or they give up the processor when they finish execution. When a thread becomes idle, these components put it in a wait state and switch to another thread. Every thread has a time limit (quantum) to run; when it reaches this limit, the components take it off the processor and switch to another thread. This process is continuous. The scheduling decisions are based on the dispatcher database, a group of structures maintained per processor with all the thread information such as priority, state and so on.

So basically processes are containers and threads run programs, using the resources that belong to their process. Just to make it very clear: when you debug any "process" you must know which thread you are actually executing/debugging.

Jobs

Jobs are an extension of the process model. Basically, a job can manage and manipulate one or multiple processes as a unit, so working with a job means working with all the processes linked to it. There are plenty of system mechanisms that a job can control, including various process-related, I/O-related and CPU-related limits. For a more thorough look at jobs, please search for the subject in Windows Internals.

The PEB

Every process loaded in Windows has an Executive Process structure (EPROCESS) that points to and maintains many structures related to its execution in the system. Here we will just talk about the PEB (Process Environment Block). The PEB is created by the kernel through the MmCreatePeb function after the EPROCESS structure is set up. As we already discussed, this structure is useful for sharing information about the process without transitions between user mode and kernel mode. So the main structure is the EPROCESS, which is located in the system address space (kernel) and has a pointer (EPROCESS_ADDRESS+0x01B0 *Peb; the exact offset varies between Windows versions) to the PEB, which is located in the user address space (user mode).

The PEB is reached through the fs segment register on 32-bit and the gs segment register on 64-bit. Strictly speaking, what these segments point to is the TEB (Thread Environment Block), which has the member ProcessEnvironmentBlock with a pointer to the PEB: at TEB_ADDRESS+0x30 on 32-bit (fs:[0x30]) and at TEB_ADDRESS+0x60 on 64-bit (gs:[0x60]). The other ways to get the PEB address rely on semi-documented or undocumented functions. To get a better view of the structure, access the link below (OpenRCE - PDF). Note that it isn't exactly the same as in up-to-date Windows versions, but it is still good enough to visualize the internal structure.

Windows Memory Layout, User-Kernel Address Spaces

It would be very expensive to access process information through system calls (system address space); the PEB improves performance by making process information available in the user address space. Some Windows components that reside in the user address space need to access information about the process, like the image loader, the heap manager and some others.

The PEB structure has a lot of member fields that you can check in the OpenRCE link I mentioned above. In this post I will write about two of them:

Loader Database (PEB_LDR_DATA)
Process Heap

Loader Database / PEB_LDR_DATA

In the previous post we discussed the PE file, and I assume that you have looked at least a little at the references I gave. The loader database is represented by the PEB's member Ldr. Inside this member (PEB_LDR_DATA) we have three doubly linked lists that link the modules loaded in the process, almost like an array. The lists are InLoadOrderModuleList, InMemoryOrderModuleList and InInitializationOrderModuleList. They contain the same modules in different orders, as the names suggest: load order, memory location order and initialization order.

Each loaded module is represented by another structure, LDR_MODULE, which stores the information about the loaded module and the links to the rest of the list. So any PE file that the process needs and has loaded in memory can be found through this structure. Let's see what the LDR_MODULE structure looks like:

LDR_MODULE Structure

As you can see, the first three members of the structure are the linked list entries, and after them comes the information about the module itself, like BaseAddress, BaseDllName and so on. So you want to know the modules loaded by some process? PEB_LDR_DATA is the way.
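To make the traversal concrete, here is a small Python model of such a list. It is a sketch that does not touch real process memory: the class and function names are hypothetical, the module names and base addresses are made up, and the only thing it tries to show is the pattern of starting at the list head, following the forward links and stopping when the walk gets back to the head.

```python
class ListEntry:
    """Stand-in for a LIST_ENTRY plus two module fields (hypothetical model)."""

    def __init__(self, name="", base=0):
        self.name = name      # plays the role of BaseDllName
        self.base = base      # plays the role of BaseAddress
        self.flink = self     # forward link, points to itself while alone
        self.blink = self     # backward link


def make_module_list(modules):
    """Build a circular doubly linked list with a head, as PEB_LDR_DATA keeps."""
    head = ListEntry("<head>")
    for name, base in modules:
        node = ListEntry(name, base)
        tail = head.blink
        tail.flink = node
        node.blink = tail
        node.flink = head
        head.blink = node
    return head


def walk_modules(head):
    """Follow the forward links until we are back at the head of the list."""
    found = []
    node = head.flink
    while node is not head:
        found.append((node.name, node.base))
        node = node.flink
    return found


head = make_module_list([("m.exe", 0x400000), ("dllsend.dll", 0x6C300000)])
for name, base in walk_modules(head):
    print(name, hex(base))
```

Counting the entries returned by such a walk is one way to notice an unexpected extra module in a process.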

Process Heap

In the PEB structure there are two undocumented pointers to heaps. At offset PEB_ADDRESS+0x18 (on 32-bit), the ProcessHeap member holds the address of the first heap created, and at PEB_ADDRESS+0x90 the ProcessHeaps member is a pointer to an array of all the heaps created. When the process is created by a debugger, two fields in the first heap, ForceFlags and Flags, are set in a way that reveals that the process was created under a debugger. Some debuggers, like x64dbg, have a "Hide Debugger" option to reset these fields.

In the references below I listed some good resources to analyze. The offsets relative to PEB_ADDRESS can vary with the Windows version. There is a good GitHub repo, al-khaser, with many anti-"everything" techniques, including the ForceFlags and Flags checks I mentioned here. Check it out. :)

References

Google
Windows Internals - Book
EPROCESS/ETHREAD
PEB_LDR_DATA / Microsoft
PEB_LDR_DATA / Undocumented
LDR_MODULE / Undocumented
Windows Memory Layout, User-Kernel Address Spaces
Doubly Linked List
ProcessHeap
Anti-Debugging / Anti-Emulation
Al-khaser Github

Wednesday, October 18, 2017

PE - Portable Executable File

Hello, folks!

I'm here again, this time with an overview of PE files. I think it's unproductive to write an extensive technical post about PE files, since there is a lot of information about them on the web. At the end of the post I linked very good references.

So the idea here is just to give a general picture of the PE file; then you will be able to do your own research, using the references as a start. I think it's easier to learn when you grab the general idea first and then start digging into each topic.

Everything I wrote here is in the references, so you can use them for more in-depth reading. If you have any questions feel free to ask me; I will be glad to help if I can.

Tools used in this post:

PE File Format

Let's start thinking about the PE file with this poor analogy: a PE file is a kind of recipe with all the ingredients inside it. So you have all the steps to execute the recipe, with every ingredient either included in it or referenced by it.

This format is derived from COFF (Common Object File Format), which came along with the VAX/VMS architecture, since many of the original Windows NT developers came from Digital Equipment Corporation, which used the COFF file format. These formats serve as the basis for loaders to read an executable on the system. So, to get Windows NT going quickly, the developers kept the original format and enhanced it into PE (Portable Executable).

The PE format is the standard format used on Windows to execute programs, i.e. the way of organizing the data inside a file that makes it possible for all flavors of Windows to read it, load it and execute it. There is almost no difference between 32-bit and 64-bit PE files; the difference lies mostly in the size of some fields.

DLL and EXE files use the same PE format and differ only in the values of a few fields, mainly in "File Header->Characteristics". OCX and even CPL files are basically DLLs too. Once you know the structure of the PE file you know how the executable is laid out in memory when it's executed, since the loader uses it to decide which parts of the file on disk will be mapped into memory. Let's see a little overview in the picture below:

PE File - File on Disk and In Memory

All the information the loader needs to map the file into memory is in the PE file itself, including how to translate an offset in the file on disk to the file mapped in memory. Next is an SVG image from Wikipedia that gives a nice view of the PE:

PE - Portable Executable Structure

When the PE file is loaded into memory it is known as a module, as are all the other PE files imported by it. The start address of a loaded PE file is known as its HMODULE, as it is referenced in Microsoft's API. Unlike the file on disk, in memory we have the concept of virtual memory, i.e. we don't access the real physical memory of our computer; the OS creates a virtual address space to allocate everything and maps/translates the virtual addresses to real physical memory. This way the OS has better control over memory management and security. Regions of the virtual address space are protected by the Windows Memory Manager as read-only, read/write or executable, according to what is specified in the section headers of the PE file.

MS-DOS Header

This header came in handy in the first versions of Windows, when Windows machines weren't as common as they are nowadays: the executable could at least print a message asking for Windows when it was run under MS-DOS. Executables always start with this header, whose first field is e_magic (IMAGE_DOS_SIGNATURE, the characters "MZ"); it's important to remember this. The most important field here is e_lfanew, which holds the file offset of the NT Header, where all the useful information resides.
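Following e_lfanew by hand looks roughly like the Python sketch below. It uses only the standard struct module and works on a bytes buffer, so it can be tried on any file read with open(path, "rb"). The function names are hypothetical, and make_minimal_image builds a fake buffer with nothing but the two signatures, just to have something to parse.

```python
import struct

IMAGE_DOS_SIGNATURE = 0x5A4D       # "MZ"
IMAGE_NT_SIGNATURE = 0x00004550    # "PE\0\0"
E_LFANEW_OFFSET = 0x3C             # position of e_lfanew inside the DOS header


def nt_header_offset(image: bytes) -> int:
    """Validate the DOS header and return the file offset of the NT Header."""
    if len(image) < 0x40:
        raise ValueError("too small to hold a DOS header")
    (e_magic,) = struct.unpack_from("<H", image, 0)
    if e_magic != IMAGE_DOS_SIGNATURE:
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if e_lfanew + 4 > len(image):
        raise ValueError("e_lfanew points outside the file")
    (signature,) = struct.unpack_from("<I", image, e_lfanew)
    if signature != IMAGE_NT_SIGNATURE:
        raise ValueError("missing PE signature")
    return e_lfanew


def make_minimal_image(e_lfanew: int = 0x80) -> bytes:
    """Fake buffer holding only the fields read above (not a loadable PE)."""
    image = bytearray(e_lfanew + 4)
    struct.pack_into("<H", image, 0, IMAGE_DOS_SIGNATURE)
    struct.pack_into("<I", image, E_LFANEW_OFFSET, e_lfanew)
    struct.pack_into("<I", image, e_lfanew, IMAGE_NT_SIGNATURE)
    return bytes(image)


print(hex(nt_header_offset(make_minimal_image())))   # 0x80
```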

PE Sections

PE file sections are used to split up the data in the file. Some sections hold code and others hold data. There are several kinds of data, like space to read and write information, API imports, function exports, resources and so on. Every section in the PE file specifies what is in it. Broadly, a PE file has two types of section: code and data.

The Windows Loader uses the information in the section header to properly load each section in memory. Each section has its own attributes, such as which type of data it holds and whether it is read-only or read/write in memory, all specified in the Characteristics field of the Section Header. A section can also be shared between processes if that is specified.

Section names are just a way to better identify what is inside; the operating system doesn't care about the name itself, only about the Characteristics field, which indicates the type of the section.

It's important to remember that sections are aligned differently in memory and on disk. On disk, each section starts at a multiple of Optional Header->FileAlignment, commonly 200h (so the offsets on disk would be like 200h, 400h, 600h...). In memory, each section starts at a multiple of Optional Header->SectionAlignment, and the loader maps the sections so that each one starts at the beginning of a memory page (which inherits the protection, read-only or read/write, specified in the section header). The page size on both 32-bit (x86) and 64-bit (x64) Windows is 4 KB, so SectionAlignment is usually 1000h; anyway, you can always check both fields in the Optional Header.
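The effect of the two alignment values can be seen with a small rounding helper. This is a generic sketch, not something taken from the PE tools used in this post; the function name and the section size 0x1234 are hypothetical, while 0x200 and 0x1000 are the typical disk and memory alignments.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (value + alignment - 1) & ~(alignment - 1)


section_size = 0x1234                         # hypothetical size of a section's data
print(hex(align_up(section_size, 0x200)))     # space taken on disk:   0x1400
print(hex(align_up(section_size, 0x1000)))    # space taken in memory: 0x2000
```

This is why the same section usually occupies more room in memory than in the file.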

Relative virtual address (RVA)

The RVA is an important piece of the PE file. It's used to locate objects after the file is loaded in memory. When a PE file is loaded in memory it starts at some address that we call the ImageBase (this address will be the HMODULE); the preferred value is stored in OptionalHeader->ImageBase. To simplify everything, let's see in CFF Explorer (PE Viewer) how they are expressed:

PE File - Section Header / Virtual Address

So to locate the .text section in memory we have to use the VirtualAddress field, which is the Relative Virtual Address of the section in memory. It's relative because the final address of the section depends on the ImageBase address. So to locate any section in memory we compute ImageBase + RVA. If the ImageBase is 0x400000 and the RVA of the .text section is 0x1000, the final address is 0x401000, which is where the .text section starts in memory.

So in the header we have both the RVA (VirtualAddress) and the offset on disk (RawAddress). If the PE file is not mapped in memory we use only the Raw Address; if the PE file is loaded we use only ImageBase+VirtualAddress.
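Both conversions fit in a few lines of Python. This is a sketch with hypothetical helper names; the section table is a hand-written list of made-up values in the same shape as the columns shown by a PE viewer (name, VirtualAddress, VirtualSize, RawAddress, RawSize).

```python
def rva_to_va(image_base: int, rva: int) -> int:
    """Address in memory once the image is loaded: ImageBase + RVA."""
    return image_base + rva


def rva_to_file_offset(rva: int, sections) -> int:
    """Translate an RVA to an offset in the file on disk using the section table."""
    for name, virtual_address, virtual_size, raw_address, raw_size in sections:
        if virtual_address <= rva < virtual_address + virtual_size:
            delta = rva - virtual_address
            if delta >= raw_size:
                raise ValueError("RVA has no backing data on disk")
            return raw_address + delta
    raise ValueError("RVA is not inside any section")


# Hypothetical section table.
sections = [
    (".text", 0x1000, 0x1800, 0x400, 0x1800),
    (".data", 0x3000, 0x200, 0x1C00, 0x200),
]

print(hex(rva_to_va(0x400000, 0x1000)))             # 0x401000
print(hex(rva_to_file_offset(0x1560, sections)))    # 0x960
```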

Data Directories

Data directories are data structures used to keep information that the PE file needs. For example, the imports have a data structure with all the information the Windows Loader needs when loading the PE file in memory, so it can load the imports before it starts executing the code.

Examples of data directories are imports, exports and resources. In the PE file we have a header to locate each one of these structures. The next image shows the header of the Data Directories; there you have the RVA of each directory in memory and its size.

PE File - Data Directories

Importing Functions

In the PE file we have a directory containing all the information about the imported functions: which functions from which DLLs. With it the Loader can load and locate all the symbols it needs to run the module.

When you use functions imported from DLLs, the compiler and linker automatically generate a PE file whose import section specifies which DLLs are used by the file, so the Windows Loader can properly load each DLL and prepare it to be used by the file at run time. Note that in my source code I didn't import all these functions, but the compiler did. I used GCC and, as you can see, it imported a lot of functions for internal purposes like security and so on.

PE File - Import Directory

The PE file keeps an array of data structures, one for each imported DLL. Each data structure points to two arrays known as the Import Address Table (IAT) and the Import Name Table (INT). In the previous image we can see that each of these data structures has the name of the DLL (ModuleName) along with the two arrays, OFT (OriginalFirstThunk / INT) and FT (FirstThunk / IAT).

The tricky part here is that both arrays have the "same" structure, because the element type is a union that can hold any of the values defined in it. I recommend that you read the references to get a more comprehensive understanding; take your time reading and exploring the PE file. Tricky part aside, in general the FirstThunk field points to the IAT array, which is overwritten by the Windows Loader with the addresses of the API functions, and OriginalFirstThunk points to an array whose entries lead to a structure with 2 fields, Hint and Name. The Name field is the name of the imported function, and Hint is a guess at the function's index in the export name table of the DLL, which the loader can try first.
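How one entry of these arrays is interpreted (before the loader overwrites the IAT) can be sketched in Python for the 32-bit case. The function names are hypothetical; the layout follows the PE format: the highest bit marks an import by ordinal, otherwise the value is the RVA of a Hint/Name structure, and a zero entry ends the array. read_hint_name expects an offset into the buffer, i.e. the RVA already translated.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000   # highest bit set means "import by ordinal"


def decode_thunk32(thunk: int):
    """Classify one 32-bit entry of the INT (OriginalFirstThunk array)."""
    if thunk == 0:
        return ("end", None)                   # a zero entry terminates the array
    if thunk & IMAGE_ORDINAL_FLAG32:
        return ("ordinal", thunk & 0xFFFF)     # low 16 bits hold the ordinal
    return ("name_rva", thunk)                 # RVA of the Hint/Name structure


def read_hint_name(image: bytes, offset: int):
    """Read the Hint (16 bits) and the NUL-terminated Name stored at offset."""
    (hint,) = struct.unpack_from("<H", image, offset)
    end = image.index(b"\x00", offset + 2)
    return hint, image[offset + 2:end].decode("ascii")


# Hypothetical data: a Hint/Name structure for a function called "send".
blob = struct.pack("<H", 3) + b"send\x00"
print(decode_thunk32(0x80000010))   # ('ordinal', 16)
print(read_hint_name(blob, 0))      # (3, 'send')
```

On 64-bit images the entries are 8 bytes wide and the flag is the highest bit of the 64-bit value, but the logic is the same.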

PE File - Import Descriptor

Once the Windows Loader has loaded the DLLs and overwritten the Import Address Table (IAT) with all the addresses that the PE file needs to import, every call to an imported symbol (API function) goes through the IAT and finally reaches the real API address.

At run time, if the call to the imported symbol first lands on a JMP instruction, then it is going through a jump stub that reads the IAT before reaching the API. If the call doesn't pass through any JMP, then it is probably calling the API through the IAT entry directly.

Malloc CALL (IDA View)

IAT Jumping To Imported Malloc (IDA View)

In a future post I will go into more detail here, doing a manual DLL hijack by overwriting the IAT. Stay tuned. :)

Exporting Functions

Exports is another data directory, containing all the information about everything the PE file exports. We refer to these exports as "symbols"; for example the API LoadLibraryA is an exported symbol of kernel32.dll. This directory is a little tricky, because it has some confusing pointers and rules. I will try to keep it simple and objective, but for in-depth information please check the references.

When a module exports functions or data to other modules, all the information must be in the Export Directory, because this is the information the Windows Loader searches when another module imports symbols from this PE file. "Symbol" is a term that covers anything that can be exported. Generally, the name of an exported symbol is the same as it was originally written in the source file. Let's have a look inside the export directory:

PE File - Export Directory

When we are consuming a DLL and need to import one of its functions, we generally call GetProcAddress to get the address of that function. Internally, the Windows Loader searches the Export Name Table (ENT / field AddressOfNames) for the function name and takes the index where it was found. It then reads the same index in the array pointed to by the field AddressOfNameOrdinals. The value stored there is the real index used to get the RVA (Relative Virtual Address) of the function: the field AddressOfFunctions points to an array of all the exported functions, and each element of it is an RVA that points to a function. The tricky part is the field Base. The ordinal that other modules see is this index plus Base, so when a symbol is requested by ordinal, the index into the AddressOfFunctions array is Ordinal - Base. Generally this field is 0x00000001 and all symbols are in order.
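The lookup can be sketched in plain C. This is only an illustration under stated assumptions: `ExportView` is a hypothetical, simplified struct that already holds resolved pointers to the three arrays, whereas the real IMAGE_EXPORT_DIRECTORY stores RVAs that must first be added to the module base. Forwarded exports and the loader's actual search strategy are left out. Following the PE format documentation, the value read from the name-ordinal array is used directly as the index into the function array, and Base is only subtracted when a symbol is requested by ordinal.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of an export directory (not the Windows struct). */
typedef struct {
    uint32_t base;                  /* Base: number of the first ordinal      */
    uint32_t number_of_functions;   /* entries in functions[]                 */
    uint32_t number_of_names;       /* entries in names[] and name_ordinals[] */
    const uint32_t *functions;      /* AddressOfFunctions: function RVAs      */
    const char *const *names;       /* AddressOfNames: exported names         */
    const uint16_t *name_ordinals;  /* AddressOfNameOrdinals: indexes         */
} ExportView;

/* Returns the RVA of the export called `name`, or 0 when it is not found. */
uint32_t find_export_by_name(const ExportView *ev, const char *name)
{
    for (uint32_t i = 0; i < ev->number_of_names; i++) {
        if (strcmp(ev->names[i], name) == 0) {
            /* Same position in the parallel array gives the function index. */
            uint16_t index = ev->name_ordinals[i];
            if (index >= ev->number_of_functions)
                return 0;
            return ev->functions[index];
        }
    }
    return 0;
}

/* Returns the RVA of the export with the given ordinal, or 0 when out of range. */
uint32_t find_export_by_ordinal(const ExportView *ev, uint32_t ordinal)
{
    if (ordinal < ev->base)
        return 0;
    uint32_t index = ordinal - ev->base;   /* ordinals are biased by Base */
    if (index >= ev->number_of_functions)
        return 0;
    return ev->functions[index];
}
```

With Base equal to 1, the first entry of the function array is ordinal 1, the second is ordinal 2, and so on.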

I think this part is the most important of all, so I will go into more detail by debugging GetProcAddress. It is interesting to see how things work. GetProcAddress is an API imported from kernel32.dll (the subsystem DLL), which uses kernelbase.dll, which in turn uses the native API in ntdll.dll (undocumented).

If you want to try it, I left the code on my GitHub so you can compile the DLL and the code that consumes it.

Github - PE Portable File

Look for Dll Consume in README

Remember to be objective at this point: follow only the addresses that matter to you. I used x64dbg, a very good debugger with a lot of functionality and a great community developing it. After the LoadLibraryA call our DLL is already loaded in memory; you can see it in the Memory Map tab (inside x64dbg). In my case it was loaded at address 0x6C300000, so this is the address that matters to us. Let's debug it. I set a breakpoint on GetProcAddress:

Breakpoint - GetProcAddress

After the breakpoint I stepped in until I found the beginning of the routine where the "Windows Loader" starts searching for the "NONAME" symbol in the Export Directory of our DLL. I will not make this part too long, so I will get right to the point. While debugging you can see that it received the BaseAddress of our DLL, got the NT header, checked whether it is a valid PE (OptionalHeader->Magic value), and got the Export Directory RVA and Export Directory Size. At this point ntdll.dll is inside our export directory. As you can see in the next image, EAX already holds AddressOfNames (0x6C30602C), and the code starts comparing the name passed to GetProcAddress with the names exported by the DLL.

EAX=AddressOfNames // ECX=PointerTo_NONAME_Function // EDX=NameToCompare

After confirming that it is the same function, the loader knows it is at index 0, because it was the first name in AddressOfNames. In the next image we will see that it reads the value at index 0 of the AddressOfNameOrdinals array, and with this value it is able to fetch the function address from the AddressOfFunctions array.

Getting the AddressOfNameOrdinals

Getting the NONAME symbol Address

Well, I hope I was clear enough in explaining the whole process and in introducing you to the Windows Loader, the basics of the subsystem and native API hierarchy, and how important the PE format is to the operating system. If you have any questions just let me know; you can email me or reach me on Twitter.

Some PE files export their symbols only by ordinal. As I mentioned, the ordinal is just an index of the symbol. So when a module imports a symbol by ordinal, its import section specifies which ordinal the Loader must search for in the export section of the imported module.

Resource files

The resources are another data directory in the PE file. They generally have their own section (.rsrc), but that is not a rule. As I mentioned before, the PE file has all the ingredients included in the recipe, so anything the PE file needs can be included in the file itself. Any type of file can be embedded in a PE file; after all, every kind of file is just binary data.

The resources are just embedded files. There are several ways to get these resources at run time, some advanced and some basic. Generally we use the Windows API to load the resources. I intend to introduce these methods in the future; for now let's see how it works.

PE File - Resource Directory

The resource directory is a little confusing if you have to read its structures by hand, but with a PE reader it is very easy. It works as a chain of structures. The main structure is IMAGE_RESOURCE_DIRECTORY. Only two of its fields matter here, NumberOfNamedEntries and NumberOfIdEntries: together, their values give the size of the array of the next structure, IMAGE_RESOURCE_DIRECTORY_ENTRY.

The IMAGE_RESOURCE_DIRECTORY_ENTRY structure has two fields, Name and OffsetToData. Now comes the tricky part. If the most significant bit of the field Name is set (different from 0), then the remaining bits are the offset to the name of the resource; if it is not set, then the field is an ID for the resource. If the most significant bit of the field OffsetToData is set, then the remaining bits are the offset to another IMAGE_RESOURCE_DIRECTORY. If this bit is not set, then the offset points to the data entry that describes the resource itself. It's important to remember that this offset is always relative to the beginning of the resource section.
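The high-bit rule is easier to see in code. The sketch below assumes a hypothetical `ResourceEntry` struct with two plain 32-bit fields standing in for Name and OffsetToData; the helper names are made up for illustration and are not part of the Windows headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define HIGH_BIT 0x80000000u

/* Hypothetical stand-in for IMAGE_RESOURCE_DIRECTORY_ENTRY. */
typedef struct {
    uint32_t name;            /* Name field         */
    uint32_t offset_to_data;  /* OffsetToData field */
} ResourceEntry;

/* True when Name holds an offset to a name string instead of an ID. */
bool entry_is_named(const ResourceEntry *e)
{
    return (e->name & HIGH_BIT) != 0;
}

/* Offset of the name string, valid only when entry_is_named() is true. */
uint32_t entry_name_offset(const ResourceEntry *e)
{
    return e->name & ~HIGH_BIT;
}

/* Numeric ID of the entry, valid only when entry_is_named() is false. */
uint16_t entry_id(const ResourceEntry *e)
{
    return (uint16_t)(e->name & 0xFFFFu);
}

/* True when OffsetToData leads to another directory level. */
bool entry_is_directory(const ResourceEntry *e)
{
    return (e->offset_to_data & HIGH_BIT) != 0;
}

/* Offset with the flag bit removed, relative to the start of the resources. */
uint32_t entry_offset(const ResourceEntry *e)
{
    return e->offset_to_data & ~HIGH_BIT;
}
```

A PE reader would walk the tree by calling `entry_is_directory` on each entry and descending until it reaches an entry whose flag bit is clear.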

In malware analysis, the resource section is a common place to keep additional parts of the malware that will be dropped on the machine (droppers). Malware generally does this to try to avoid triggering antivirus alerts, because many antivirus products, if not all of them, perform heuristic analysis looking for malicious behavior. So in many cases the malware (PE file) stored in the resources is packed; when the main program is executed it unpacks the file from the resource section and drops it in some folder. Tools like CFF Explorer and Resource Hacker help us inspect this.

To make it clearer, let's compile a program with another executable inside it. I left a link to the source used here. I compiled everything with GCC. Follow the link:

Github - PE Portable File

Look for Resource Files in README

As you can see, I used an icon and another source file. Follow the steps on GitHub to compile the first source, inversing.c, whose executable will be embedded in our main program, blog_resource.c.

After you have compiled inversing.c, move the executable to the blog_resource.c folder. In res.rc you can find the names of the files to embed. I used "inversing.exe"; you can use whatever you want, just change the name in the .rc file. You first have to compile res.rc to an object file, and for this we use the windres tool that comes with the GCC toolchain. The README has all the commands to compile; if you have any questions please contact me.

After you have compiled blog_resources.exe you can execute it to see that the code reads the first two bytes of the embedded inversing.exe file, which are "MZ" (the bytes 0x4D, 0x5A in file order, or 0x5A4D when read as a little-endian 16-bit value). These two bytes are the IMAGE_DOS_SIGNATURE of all executable files. We can see them in Resource Hacker.

Resource Hacker - inversing.exe inside blog_resources.exe
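As a small illustration of that signature check, here is a sketch that tests whether a buffer starts with "MZ". It assumes the resource bytes are already available in memory (for example through the FindResource / LoadResource calls listed in the references); the function name `has_mz_signature` is hypothetical and this is not the code from the repository.

```c
#include <stddef.h>
#include <stdint.h>

/* "MZ" read as a little-endian 16-bit value. */
#define DOS_SIGNATURE 0x5A4Du

/* Returns 1 when the buffer starts with the bytes 0x4D 0x5A ("MZ"), else 0. */
int has_mz_signature(const unsigned char *data, size_t size)
{
    if (size < 2)
        return 0;
    /* Build the 16-bit word by hand so the check does not depend on the host byte order. */
    uint16_t word = (uint16_t)(data[0] | (data[1] << 8));
    return word == DOS_SIGNATURE;
}
```

Reading the two bytes one by one avoids any confusion about byte order: the file always stores 0x4D first and 0x5A second.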

It's confusing, but you can read more about it in the references. In future posts I want to go into more detail on how a dropper could work, both using the Windows API and without any API (that is possible too). In the references you can also read more about the functions I used to find the resource file.

.NET Header

This header is present in PE files compiled for the .NET Framework (obviously). It is needed for information specific to .NET compilation, such as metadata and intermediate language (IL). Unlike languages compiled directly to native code, like C/C++, a .NET executable has its entry point in MSCOREE.DLL, which is the DLL that starts the execution based on the information in the .NET header.

To make it a little clearer, the image below gives a short overview of the .NET execution flow:

.NET Framework - Book Eldad Eilam - Reversing: Secrets of Reverse Engineering

The assembly part in this case is from the .NET Framework; the application itself is Intermediate Language (IL) executed by the Common Language Runtime (CLR). In this case the disassembler tool used is one for IL.

Conclusion

First of all, I apologize for any mistake I may have made, and if you find one I would appreciate it if you contacted me. I would be glad to help with any question you may have, if I can. As I already said, I tried to keep it simple and objective. I hope this can be useful for someone. This post is part of a series of introductory topics that prepare for my future, more technical posts.

I linked all the references at the bottom. The answer to any question that may arise is almost certainly in the references, but feel free to contact me anyway. My post is simple and objective and gives a direction on PE files; for more in-depth information, the references are the way to go. Thanks! :)

References

Matt Pietrek - Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
Matt Pietrek - In-Depth Look into the Win32 Portable Executable File Format
PE Format Overview (Wikipedia)
MSDN - PE Format
Windows Executable Files
Eldad Eilam - Reversing: Secrets of Reverse Engineering
Setting ICONS for Windows programs with GCC
FindResource Reference
LoadResource Reference
Resource Types

Friday, September 1, 2017

Solving the first exercise

Hello, there!

In this post we will reverse engineer the exercise program that I left in my previous post about compiling C programs. I tried to make it well explained and objective. If you have any questions, let me know.

Prerequisites:

Writing and compiling a program in C

At the end of the post linked above there is code that we are going to use in this post. Compile it and let's begin. Tools in this post:

In my last post I left a (simple) exercise whose goal was to reach the "Secret Stuff!". In this post I will show two ways to do it. It's a very simple program written in C. I assume that you have already compiled the program as described in the prerequisite link.

First I ran it to see what happens:

Fig1 - Running the program

Nothing much, just a simple "Not Valid". Next I will run strings.exe from Sysinternals to see if we can find something interesting. I added my strings program to the Path environment variable to make it easy to execute. Take some time and look closely for anything that could be interesting. There are a lot of strings here, but about 90% of them are standard, and nothing seems to be encrypted. After some time scrolling up and down, I found some interesting strings:

Fig2 - Strings the program

Well, I see that I will have to type some password, but when I first ran the program nothing was asked of me. Why is that? We also have the secret stuff, so let's begin with it, since our goal is to reach that part; we will deal with the password on the way there if we have to.

We could open it in a tool like CFF Explorer to dig into the imports and so on, but that doesn't seem necessary at this point. So let's open it in IDA Pro and see if we can find anything interesting. With the program open in IDA Pro, let's look for our goal in the Strings tab: click the tab, search for "You reach the secret stuff!!", then double-click the line with the string. IDA takes us to the offset in the rdata section where the string starts. Click the offset name aYouReachTheSec and press the X key to list the references to this offset; since there is only one, double-click it. We are taken to the function that uses this string, sub_401517. We can click its name and press N to rename it; I renamed it to SecretSub. This would already be enough to crack it. I will show two ways to do it: cracking it (patching) and bypassing it. To bypass it we just have to find out how to reach the secret stuff without hardcoding/patching the binary.

Fig3 - String offset in the section rdata

Bypassing the binary

To bypass the binary we first have to find out how it reaches SecretSub in the normal way, so click the name of the sub and press X to find the references to it. We find only one reference:

Fig4 - Main flow

As we can see, there is a small piece of the program that asks for a password using scanf (C API), calls sub_401460 and right after that executes cmp eax, 1. We can deduce that the program reads the password, sub_401460 checks it, and the result (whether the password is right or not) is returned in eax. Let's see how the binary validates the password by double-clicking sub_401460. We are taken into the sub, which I have already renamed to PasswordCheck to make it easy to identify.

Fig5 - Password checking

We can see some loops, some ifs, and at the end of the sub only one piece of code that leads to the good ending (mov eax, 1); the other pieces just print "Not Valid" on the screen. The first part is interesting: many hex values are stored in variables in the first piece of code, and we will look at them later. IDA names the variables for us to make our work easier. These variables are just offsets in the stack; for example, [ebp+var_c] without IDA would be [ebp-C]. We can even rename the variables by clicking their names and pressing N, which makes the code easier to read.

After looking at this code for a while, we can identify that the first part is a counter that increments the variable var_10, which is then compared in cmp [ebp+var_10],14h. The value in this variable has to be lower than 14h (decimal 20); if it is larger, we get a "Not valid" message.

We can deduce at this point that the password must be under 20 characters. The next block uses the variables var_c (counter) and var_1B (string reference). var_c is an index into the string array for both the argument arg_0 (typed password) and var_1B (hardcoded password). The code compares both byte by byte to check whether the password matches. So we can take var_1B through var_11 to get the expected password. If we put all the values together we can convert the hexadecimal to ASCII characters, but we have to keep in mind that the values are stored in memory in little-endian order, so the bytes of each value appear reversed in the disassembly. Written in memory order they are 3332312D3132332D323331, and converting this we get "321-123-231". Now we have the password, but if we run the program we never get to the part where we type it, so we have to dig a little more. Let's go back a little and see what has to happen to reach the part that asks for the password.
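The byte-order step can be reproduced with a few lines of C. This is a sketch under an assumption: the exact way the compiler split the string into immediates is not shown in the post, so the three 32-bit values used in the usage note are hypothetical. Only the resulting byte sequence 3332312D3132332D323331 comes from the analysis.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copies the bytes of each 32-bit immediate into `out`, lowest byte first,
 * which is the order a little-endian CPU uses when it stores them in memory.
 * `out` must have room for count * 4 + 1 bytes.
 */
void decode_le_immediates(const uint32_t *imm, size_t count, char *out)
{
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        for (int shift = 0; shift < 32; shift += 8)
            out[pos++] = (char)((imm[i] >> shift) & 0xFFu);
    }
    out[pos] = '\0';
}
```

Feeding it the hypothetical immediates 0x2D313233, 0x2D333231 and 0x00313332 yields the bytes 33 32 31 2D 31 32 33 2D 32 33 31, that is, the string "321-123-231".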

As we can see in Fig4, there is a cmp with arg_0 in the main function and, right after it, a conditional jump JG. Since this program was written in C, we know that the first argument of the main function is the number of arguments passed to the program. If you run the program from CMD, the first element of the argument list is always the program being executed. Therefore we must pass more than one argument when executing this program. To do this we just add any text after the program name, like ">main.exe newargument". If we execute it like this we get past this check and reach the password part. Let's try it:

Fig6 - Program bypassed

Voilà! We did it: we bypassed the exercise without patching anything.
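Putting the pieces together, the logic recovered from the disassembly might look roughly like the C below. This is a reconstruction for illustration only, not the original source of the exercise: the function names are hypothetical, and the checks simply mirror what was observed (an argument count greater than one, a length under 20, and a byte-by-byte comparison).

```c
#include <stddef.h>
#include <string.h>

#define MAX_PASSWORD_LEN 20   /* 14h in the disassembly */

/* Mirrors the cmp/JG in main: the prompt is only reached with extra arguments. */
int reaches_password_prompt(int argc)
{
    return argc > 1;
}

/* Mirrors PasswordCheck: returns 1 for the expected password, 0 otherwise. */
int password_check(const char *typed)
{
    static const char expected[] = "321-123-231";

    if (strlen(typed) >= MAX_PASSWORD_LEN)
        return 0;

    /* Compare byte by byte, including the terminating NUL. */
    for (size_t i = 0; i < sizeof expected; i++) {
        if (typed[i] != expected[i])
            return 0;
    }
    return 1;
}
```

Comparing up to and including the terminator means a shorter or longer input fails at the first differing byte, so the loop never reads past the end of the typed string.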

Patching the binary

Now let's do a quick and functional patch to get this working without typing anything. We want to reach the secret stuff without any effort: execute the program and get there. How can we do it? We now know the address that calls SecretSub, so we just have to change the conditional jump JG that comes right after the comparison of the program's argument count into a JMP straight to the call to SecretSub. We can do this using x64dbg. Open the program in the debugger, go to address 40153E with CTRL+G and press the space bar to patch this line. To work properly we can't simply call or jump into the secret stuff itself, because the program must return in order to exit properly, so we jump directly to the right call using an unconditional jump, jmp 0x40158E. Press CTRL+P to open the patches window, click Patch File, choose a name for the new executable and save it. Now try to run it.

Fig7 - Patched program

Voilà! Now we access the secret stuff without having to type anything.

Well, that is it, folks! This program was quite simple, with no anti-reversing technique at all. In the next posts I will try to bring more difficult exercises. :) Thank you for your time and see you soon! If you have any questions at all, please let me know! :)

Thanks! Best regards!

Wednesday, July 19, 2017

ToolKit Swiss - Chrome Extension

Hello there!

Recently I have been working on a Chrome extension that helps me with internet research and navigation. I coded it in a way that makes it easy to write new tools, so I only spend time on the tool itself, without needing to write the whole extension again for each new tool I may need.

The extension is still in development, and I only work on it in my free time. I think it can be useful for someone who wants the same functionality or wants to code Chrome extensions; there are plenty of examples that can be useful.

In this post I tried to cover the steps that I use to code a new tool, showing how my "ToolKit Swiss" works. I posted version 0.1, which was pretty rustic and full of boilerplate. Version 0.2 is easier to extend and has new functionality. As I keep working on it, it will get better and more flexible. If you have any ideas you can contact me; I would be glad to help if needed.

Well, let's do it!

Design Overview

In this diagram you can see how the classes interact with each other. ToolKit runs in the background, intercepting all actions on the context menus it created, and the rest is injected into the page. ToolKit provides all the resources a tool needs. Each tool has its own JSON with the HTML creation logic, which is handed to the FrontEnd, which in turn passes it to the ElementBuilder to build the HTML.

In the next example I will try to demonstrate the flow of the ToolKit Swiss, from the start to the tool execution. A tool inside the ToolKit has full freedom over all DOM elements, since it is injected into the page.

By the way, the source is commented and I will export it to proper JSDoc documentation as soon as I can. I will discuss it in general terms while creating a new tool.

Some parts of the tool creation are still hard coded in some classes. My goal is to make this more flexible as soon as I can; until then, it is necessary to add code inside the classes.

Base64 Encode-Decode

I will discuss the hard coded part first, where code must be added to make the tool work in the ToolKit Swiss.

First of all, let's do the hard coded part in the classes. We need an ID for the new tool, so let's add it to types.js:

https://github.com/bernardopadua/toolkit-swiss/blob/master/js/types.js#L33

Now we have to create the context menu in toolkit.js, in toolContextCreate_:

https://github.com/bernardopadua/toolkit-swiss/blob/master/toolkit.js#L87

And the function event to handle the context click:

https://github.com/bernardopadua/toolkit-swiss/blob/master/toolkit.js#L314

Well, that's it. Now let's move to the tool itself. I begin with the element JSON (HTML logic). Some elements must always be present in the JSON of a tool; I will improve this as soon as I can.

Let's copy template.json and alter it for our purpose, since its layout is what I need. The important thing is that all components used to build the tool must be listed in the class property of the component's HTML tag.

https://github.com/bernardopadua/toolkit-swiss/blob/master/elements/base64encodedecode.json

It's important to change the "toolkit-block-base64" (Line 12) for each new tool. I forgot to comment on this line in template.json.

The code we need to start a tool is:

https://github.com/bernardopadua/toolkit-swiss/blob/master/js/tools/base64encodedecode.js

I will just go through the key parts of the tool. Since I fully commented the Tool class, you can get a very good idea of what it is capable of. All my tools are commented as well. If you have any questions just send them to me; I would be glad to help.

If in doubt, just look at tool.js. Everything is there.

Line 11:

init must be set in every tool. FrontEnd calls this constructor when it loads the tool;

Lines 14-15:

This property is mandatory for internal purposes;

Line 26:

As I said before, your components must be in the class property of the HTML tag;

Line 31:

To have this functionality you must set a DIV on the JSON element with class 'error-raise';

Line 43:

Every new tool must do this to inherit the properties of the Tool;

Lines 50-51:

This is mandatory for every tool. ctxFront is an instance of FrontEnd;
initTool_ initializes the tool on FrontEnd for interaction;
getElements_ makes the FrontEnd call for the elements of the invoked tool;

For a better view of all the functionality, I recommend looking at the translate tool. It makes more complete use of the features available. If you need anything, just contact me.

https://github.com/bernardopadua/toolkit-swiss/blob/master/js/tools/translateit.js

Thanks!

Monday, July 10, 2017

Assembly Basics

Introduction to Assembly

Hey folks, here I am again bringing more basic reverse engineering topics, basic for now, okay?! I want to bring something more challenging in future posts, but there has to be an introduction first. Forgive any silly thing I say during the post, and if I do make a mistake please feel free to let me know. Reverse engineering can only be done with knowledge of Assembly; there is no other way, so either we learn it or we learn it.

If you haven't read my last post, I recommend it. I try to approach computing in a more dynamic way. In the end everything is assembly, as you probably already know: no matter which language you use, it becomes assembly in the end, or at least it is interpreted through it. Either way it is good to know assembly, and it is not that hard to understand, even more so if you have already used a procedural language, like Pascal for example.

Assembly is not universal: different assemblers use different syntaxes, and the instructions depend on the architecture used. After all, some processors interpret instructions differently, just as happens with high-level languages. What changes is the syntax; the concepts remain. In my posts I will cover the IA-32 architecture (which is the most common around the world). Keep in mind that this post is not a complete class, but I will cover the concepts so that we can start more technical posts.

As we saw in my previous post, the computer works with the concept of Memory-Data X Instructions. It stores everything in memory and then interprets the data and instructions according to need and to the way we command it. If you hand some data to the CPU and tell it to execute it, it will try to execute it. Later on we will see how this happens.

Computer / Memory, CPU, IO Devices (Input-Output) and BUS

The processor has some pointers that help the CPU throughout its execution; some point to instructions and others to data. Basically it is what I have already said: perform a given operation on the given data. So we have the instruction pointer and the data pointer. We will work with them a lot in future posts. The instruction pointer, as the name says, points to the memory that contains the instructions, and the data pointer points to the memory that contains the data; with this information the CPU carries out the necessary procedures.

CPU Units X Memory

Don't be scared, I will explain in more detail. The CPU keeps the values of these pointers in registers. As the processor executes instructions, the instruction pointer moves on to the next instruction, and so does the data pointer. Each instruction is identified by an opcode (operation code), which is between 1 and 3 bytes long. Basically, assembly has 3 main parts: opcodes (operations), data sections and directives.

Opcode

A mnemonic is a more understandable representation of an opcode. It is much more comfortable to read than machine code. In the example below we have the machine code on the left side and the mnemonics on the right side. For example, the opcode '89' is a MOV instruction. Different "flavors" of assembly represent these values in different ways.

OllyDbg / Assembly on the right, opcodes on the left

Data

Data sections are spaces used to store the data that the instructions need to fetch in order to execute. The data can be in some section of memory or on the stack (more on that later). All data is stored using its hexadecimal representation and is referenced by its address in memory. Everything stored in memory has an address. We will see that data or addresses can be supplied through an immediate constant, which is written explicitly in the instruction itself, or can live somewhere in memory, which includes the stack frame.

Directives

Directives are elements used in assembly to tell the assembler (the assembly compiler) how a given piece of data should be used. For example, if you need to store a float number, the assembler needs to know the data type to store it correctly. The same applies when you want to use data located at a given memory address. Directives create sections in memory to store the data types. One of the most important directives is ".section", which creates sections in memory for the given data types.

Sections

We have several kinds of sections, and we can even create others if we want. However, the sections listed below are standard:

.text:

All the instructions that the processor will execute are stored in this section. Data is not allowed here, except for immediate constants, which as I said before are written directly in the instruction. An example is a value assignment like a = 5, but this depends on the assembly programmer or on the compiler in high-level languages.

.data:

This section is responsible for storing all the data that will be used by the .text section. The data is referenced through memory addresses, like program.ADDRESS for example.

.bss:

This section is generally used for uninitialized data, which is filled in during the execution of the program. I believe the name of the section changes depending on the language.
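As a rough illustration of where things usually end up, consider the small C fragment below. It is only a sketch: the names are made up, and the placement described in the comments is the typical behavior of compilers such as GCC, not a guarantee, since the exact layout depends on the compiler and its options.

```c
#include <stdint.h>

/* Initialized global: typically placed in .data. */
int32_t initialized_counter = 5;

/* Global without an initial value: typically placed in .bss and zeroed at load time. */
int32_t uninitialized_counter;

/*
 * The machine code of this function typically goes to .text.
 * The literal 1 is usually encoded inside the instruction as an immediate constant.
 */
int32_t bump_counters(void)
{
    initialized_counter += 1;
    uninitialized_counter += 1;
    return initialized_counter + uninitialized_counter;
}
```

Inspecting the compiled object file with a PE reader or a disassembler is a good way to confirm where each symbol actually landed.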

IA-32 Architecture

The IA-32 architecture was developed by Intel; it was introduced with the 80386 and is the one used by the Pentium processors. I am not sure it is really the most used today, but it is VERY well known, has plenty of documentation and is widely used. When we are learning assembly, the architecture is mostly a matter of syntax. Migrating to other architectures gets easier once we have learned one well, because the concept is always the same and what changes is how to write it. Some architectures have more instructions or more elaborate instructions, and others fewer. The IA-32 architecture is basically divided into 4 parts:

Control unit:

The control unit is responsible for fetching data and instructions from memory. It then translates these instructions into micro-operations and passes them to the execution unit. After execution, the result returns to the control unit, which stores it.

Execution unit:

Executes all the micro-operations and returns the result.

Registers:

Registers are responsible for storing data and memory addresses. They live in the processor's internal memory. The processor keeps data in internal memory to make execution faster, because fetching it from the computer's RAM is much slower.
To keep the post objective I will not write extensively about all the registers, but I am leaving references at the end of the post with more accurate content. Basically there are 4 types of registers (there are more, see the references):

Flags:

Flags are used to keep track of the operations executed by the processor. Through them we know whether the operations worked or not. Certain flags are set according to the operation executed. We will look at them more closely.

Registers

General Purpose

These registers are mainly used to work with the data that the instructions/operations use. They are 32 bits wide, but they can also be accessed as 16-bit parts, and some of them as 8-bit parts.

EAX (32-bits):
- AX (16-bits):
  - AH (8-bits, high byte) and AL (8-bits, low byte)
EBX (32-bits):
- BX (16-bits):
  - BH (8-bits, high byte) and BL (8-bits, low byte)
ECX (32-bits):
- CX (16-bits):
  - CH (8-bits, high byte) and CL (8-bits, low byte)
EDX (32-bits):
- DX (16-bits):
  - DH (8-bits, high byte) and DL (8-bits, low byte)
EDI (32-bits):
- DI (16-bits)
ESI (32-bits):
- SI (16-bits)
EBP (32-bits):
- BP (16-bits)
ESP (32-bits):
- SP (16-bits)

OllyDbg / Registers and Flags

These are the registers we will work with most of the time. It is important to stress that modifying a register higher in the chain also modifies the ones below it. For example, if we store a value in AL and then put another value in EAX, the value we had placed in AL is replaced by the new value placed in EAX, since AL is part of EAX.
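The relationship between EAX, AX, AH and AL can be modeled with shifts and masks. The sketch below is only a model in C, with hypothetical helper names; it treats EAX as a plain 32-bit value and does not touch real CPU registers.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}

/* AH is bits 8..15 of EAX (the high byte of AX). */
uint8_t reg_ah(uint32_t eax)
{
    return (uint8_t)((eax >> 8) & 0xFFu);
}

/* AL is bits 0..7 of EAX (the low byte of AX). */
uint8_t reg_al(uint32_t eax)
{
    return (uint8_t)(eax & 0xFFu);
}

/* Writing AL replaces only the lowest byte and keeps the rest of EAX. */
uint32_t write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}
```

For the value 0xDEADBEEF, AX is 0xBEEF, AH is 0xBE and AL is 0xEF; storing a completely new value in EAX therefore changes all three at once.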

Some of these registers have a conventional use, such as EBP and ESP, which are used to control the stack frame. The stack frame is a block of memory used to hold the context of values and data of a given function/method; for example, when we declare a variable inside a function, its value is stored in the stack frame. We also use the stack to pass values as parameters when calling other functions, using the push instruction, which stores values on the stack. We will see this better in practice over the course of the posts.

As I always say, it is important to program in a "low-level" language like C or C++, because the conversion curve to assembly is smaller and you also get practice handling memory directly. That gives you a better view of what the code looks like in assembly. Throughout my posts I will bring plenty of C code for us to analyze.

Segment

Antigamente podia-se escrever diretamente na memória física, pois o processador permitia esse tipo de ação, esse modo é chamado de real mode. Atualmente o real mode ainda funciona porém de forma limitada. O modo que é utilizado atualmente pelos sistemas operacionais (Windows NT em diante) é o protected mode com o conceito de paginação de memória, dessa forma fica explícito que em segmento de código não pode ser escrito nada, somente em segmento de dados. Pode-se também descrever blocos de memórias com determinados níveis de acesso. O assunto é extenso, vamos com calma ;p

Portanto os segmento de registros são utilizados para identificar onde os dados estão localizados em memória. Cada registro de segmento possui um ponteiro para a section (seção) onde ele irá pegar os dados necessários. Os segmentos são:

CS: Code Segment
DS: Data Segment
SS: Stack Segment
ES: Extra Segment
FS: Extra Segment
GS: Extra Segment

Each of them is used in specific cases. For example, if the EAX register holds a memory address inside the data section and you need to store the information that is in EBX there, you will see something like:

MOV dword ptr ds:[EAX], EBX

With this you are moving the information EBX contains into the memory address held in EAX, within the DS segment (data segment). Note that you are not storing into the EAX register itself, but into the address EAX points to, in the section the DS segment points to.

OllyDbg / Immediate constant being moved onto the stack

We will see many practical examples, which is the best way to learn!

Flags

The flags are kept in a single register called EFLAGS, and each flag is represented by 1 bit inside that register. As I said before, flags are used to track the success or failure of the operations/instructions the processor executes. Take the conditional JUMPS, for example: they use some flags as a reference to decide whether the jump will be taken or not. A jump, as the name says, is a jump to some memory address, that is, execution continues from another address, another point inside the program. Flags are divided into 3 groups:

Status Flags
Control Flags
System Flags

I will basically talk about the status flags. The references at the end of the post list good books with more information and more accuracy; the idea here is objectivity and the big picture. The status flags are a "signal", or status, of the result of the arithmetic operations executed by the processor. The flags are:

CF: Carry Flag.

The carry flag supports binary operations by saving the carry or the borrow. It is used to "carry" the bit left over from the operation (see the introductory computing post). It is generally used in unsigned operations.

PF: Parity Flag.

Set when the number of 1 bits in the result of the operation is even (the processor only looks at the least significant byte).

AF: Adjust Flag.

Used in Binary Coded Decimal (BCD) arithmetic when there is a borrow or a carry. As the name says, it is the flag for adjusting the value of the byte that represents the digit within the BCD. It is a tricky subject; I recommend reading the reference (Richard Blum).

ZF: Zero Flag.

Set when the result of a binary operation is 0.

SF: Sign Flag.

Used in operations with signed numbers, meaning numbers that carry a sign (positive or negative). Read the introductory post, which has good material on this. This flag is set when the operation results in a negative number.

OF: Overflow Flag.

This flag is used with signed numbers when the operation produces a result too large to be stored, in other words the signed result overflowed and is wrong.

The subject really is tricky, but with practice and more research we will get to a good understanding. Any questions, I'm around.

Stack

The stack is very important when it comes to reverse engineering; it is used all the time during the execution of any program. It is used to store short-term data, so to speak. As I said before, when execution enters a function a new stack frame is created, which is a piece, or block, of the stack reserved for the function being executed. The stack is generally used to:

Save register values:

It is often used to save the value of a register when that same register is needed for another operation. The value can be restored afterwards.

Allocate local variables:

As I already said, your local variables are allocated inside the stack frame when execution enters a given function. And when you need to use those values, the segment we talked about earlier is used: SS, the stack segment.

Pass parameters to functions:

To call a function, the PUSH instruction is generally used to send the values to the stack, and then the function is called with the CALL instruction. These parameters are generally passed in reverse order, from right to left. For f(p1, p2, p3) we do PUSH p3, PUSH p2, PUSH p1 and then perform the CALL.

Store the address where execution continues after the CALL instruction:

When the program executes a CALL instruction the flow of execution changes, so the CALL first saves the address of the instruction right after it. When the function finishes, it needs to return to the flow of execution it was in before, and that happens when the RETN instruction is executed.

When the program enters a function, the variables in that function's scope are stored on the stack. The stack is nothing more than a piece of memory the program uses to keep the data it needs. To store data on the stack we use the push instruction, and to retrieve values from the stack we use the pop instruction.

Every time program execution enters a function a stack frame is set up. A stack frame is nothing more than a block inside the stack. The stack frame is bounded by the ESP and EBP registers: ESP points to the top of the stack and EBP to the base of the current frame. Let's see an example of assembly code that sets up the stack frame:

PUSH EBP
MOV EBP,ESP
SUB ESP, SIZE

First the value of EBP is saved on the stack itself, so that when we leave the function the old stack frame can be restored. Then the value ESP currently holds becomes the base of our new stack frame. Remember that ESP is the top of the stack, so imagine we are placing a new file folder on top of an existing one. When the function finishes, imagine we remove the folder that was placed on top, leaving the one that was already there. This is how we see that the stack works as LIFO (last in, first out). Finally, the SUB instruction subtracts SIZE from ESP. Since the stack grows toward lower addresses, lowering ESP increases the size of our stack frame.

This SUB is generally how space for the variables is reserved up front, so afterwards we only need to access that space inside the stack frame directly, using our stack segment, SS, which we saw earlier. We would have something like this:

MOV EAX, dword ptr SS:[ESP+4]

In this last instruction, the value in ESP plus an offset of 4 bytes gives us the address of the desired variable, and the value stored at that address is moved into the EAX register. In this case the variable is [ESP+4]. If we had one more variable, it could have another address, such as [ESP+8].

One thing we have to understand is that the stack grows toward lower memory addresses, that is, it grows "downward". When a program starts executing, the stack begins at its highest addresses and grows toward the lower ones. Let's look at the example:

How the stack works

Heap

The heap is an area of memory that programs use for dynamic memory allocation. The operating system generally takes care of this for programs: when they start and need a heap for some reason, the operating system itself creates this space in memory and hands the program a pointer to it. When we have a constant or an already initialized variable, for example:

char newText[] = "Testing how code storing data.";

The compiler takes care of placing this variable's value in the .data section, which is where the program's initialized data lives. Then, in the program's code, the compiler puts the address directly in the instruction being executed, because the program already knows where it stored this information. We can call this address an immediate constant. Inside the program it would look something like program.ADDRESS, where ADDRESS is the address of the variable within the section.

However, when you load an external file, the program needs to allocate space for that file, and the size needed is dynamic, so the heap is required. If you open a 5 KB file, a heap block of the necessary size is made available. If you load a 1 MB file, a heap block of that size is made available instead. The heap is nothing more than dynamic memory allocation.

We have to pay attention to the heap in reverse engineering, because a lot of important information is stored in heaps. We will see more over the course of the posts, and in practice, how it works.

Conclusion

Well folks, that's it! I want to apologize in advance for any error or misunderstanding in the explanation. I am trying to be as objective and didactic as possible, but covering all of this in a single post is VERY hard, because there is a lot of information. I believe this overview will make it easier to direct your research on the topic, because I know it is not possible to understand everything from this post alone.

If you have any questions about the topic just reach out; if I can help, I will gladly do so. And if I don't know, we will research it together. The goal is simply to learn. Who knows, if we gather a nice crowd maybe we can put together a study group. Just an idea. Anything at all, I'm around. 👍

References

Google
Eldad Eilam - Reversing: Secrets of Reverse Engineering
Reverse Engineering Code With IDA Pro
Richard Blum - Professional Assembly Language