I've got some old C++ source code that parses an input file. I removed the unrelated parts and reproduced it (the logic is the same) as follows:
#define _CRT_SECURE_NO_DEPRECATE
#include <stdio.h>
#include <iostream>
typedef struct
{
    unsigned short int id;
    unsigned long size;
    unsigned long time;
    unsigned short int day;
    unsigned short int year;
    unsigned short int source;
    unsigned short int destination;
} header;
#define READ_SIZE 1024 * 2
#define MAX_LENGTH 1024
#define BUFFER_SIZE READ_SIZE + MAX_LENGTH
const int SIZEOF_CHAR = sizeof(unsigned char);
int main()
{
    int index = 0;
    int offset = 0;
    header* hdrPtr;
    FILE* fp;
    char buff[BUFFER_SIZE];
    fp = fopen("data.dat", "rb");
    int numRead = fread(&buff[offset], SIZEOF_CHAR, READ_SIZE, fp);
    hdrPtr = (header*)(unsigned short*)&buff[index];
    return 0;
}
The input data.dat file is a binary file with certain formatting rules (honestly, at this point I'm not sure what those rules are; my job is to figure them out and then translate the code into a new C#.NET codebase). To make it easier, I tested with a dummy text file and with some random binary files such as a .PDF or .MP4, and still got results (hdrPtr). However, I'm still unable to understand those results.
For example, I tested with a data.dat text file with the content of:
Hello
World
I got a pointer to a header with these values (I don't understand where these numbers come from):
id 25928 unsigned short
size 1460276591 unsigned long
time 1684828783 unsigned long
day 52428 unsigned short
year 52428 unsigned short
source 52428 unsigned short
destination 52428 unsigned short
The same happens with other data.dat input files, including real binary ones. I had no experience with C++ until last week, and this seems rather weird to me! How come we can cast the address of the first element of the array to a pointer to an unsigned short, and then cast that to a pointer to the header struct (!?).
I'm struggling to convert the above code to C#. Any help/hint and recommendation is appreciated!
char, but the reverse is not true. What's happened is that the bit pattern read from the file is being viewed as a structure, and since the bit pattern in the file isn't a structure, the results are insane.

(header*) is an explicit type conversion. You should avoid these most of the time because they tell the compiler to turn off its brain and do exactly what you told it to do. You have to be absolutely certain you are correct, and you have to actually be correct, because if you aren't, there will be no warning. The compiler produces code that does exactly what you asked for, no matter what the outcome will be at runtime. My rule of thumb when I see one of these casts is to assume there's a bug and examine the code more closely.

int numRead = fread(&buff[offset], SIZEOF_CHAR, READ_SIZE, fp); would have been a lot safer as header hdr; int numRead = fread(&hdr, sizeof(hdr), 1, fp); but there are still a few gotchas. The file might not contain a valid header, and there's no way to check other than to read the header, check numRead to ensure the whole header was read (with a count of 1, fread returns 1 on success), and then sanity-check the values read. The byte order of the numbers could be backward. The size of the integers could be different from what was written. Believe it or not, I've seen a 32-bit short.

If you use fread(&hdr, sizeof(hdr), 1, fp); then header* hdr; should be header hdr; instead.

header* hdr; int numRead = fread(&hdr, sizeof(hdr), 1, fp); should be header hdr; int numRead = fread(&hdr, sizeof(hdr), 1, fp); I removed the ptr from the identifier, but neglected to remove the * that made the variable a pointer in the first place.
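To make the "bit pattern viewed as a structure" point concrete, here is a minimal sketch that decodes the header field by field from raw bytes instead of casting the buffer. It rests on several assumptions that are not confirmed by the question: the data is little-endian, unsigned long is 4 bytes, and the compiler inserted 2 padding bytes after id (the usual layout with MSVC on Windows), so the header occupies 20 bytes. The names Header, kHeaderBytes, readLe16, readLe32, and parseHeader are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-width mirror of the question's `header` struct.
struct Header {
    std::uint16_t id;
    std::uint32_t size;
    std::uint32_t time;
    std::uint16_t day;
    std::uint16_t year;
    std::uint16_t source;
    std::uint16_t destination;
};

// Assumed layout: 2 padding bytes follow `id`, so `size` starts at
// offset 4 and the whole header spans 20 bytes.
constexpr std::size_t kHeaderBytes = 20;

// Combine 2 bytes, least significant byte first, into a 16-bit value.
std::uint16_t readLe16(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Combine 4 bytes, least significant byte first, into a 32-bit value.
std::uint32_t readLe32(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode a header from `data`. Returns false when fewer than
// kHeaderBytes bytes are available, so a short read is never
// silently interpreted as a header.
bool parseHeader(const unsigned char* data, std::size_t len, Header& out) {
    if (data == nullptr || len < kHeaderBytes) {
        return false;
    }
    out.id          = readLe16(data + 0);
    out.size        = readLe32(data + 4);   // bytes 2..3 are padding
    out.time        = readLe32(data + 8);
    out.day         = readLe16(data + 12);
    out.year        = readLe16(data + 14);
    out.source      = readLe16(data + 16);
    out.destination = readLe16(data + 18);
    return true;
}
```

Under these assumptions the numbers in the question can be reproduced by hand. If the text file was saved with Windows line endings, it holds the 12 bytes "Hello\r\nWorld": id is "He" (0x6548 = 25928), "ll" falls into the padding, size is "o\r\nW" (0x570A0D6F = 1460276591), and time is "orld" (0x646C726F = 1684828783). The remaining fields lie past the end of the file data; 52428 is 0xCCCC, which matches the 0xCC pattern that MSVC debug builds write into uninitialized stack memory. The same field-by-field approach should carry over to C#, where System.IO.BinaryReader reads little-endian values with ReadUInt16 and ReadUInt32.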