Recently I expressed being unsure of what someone meant when they indicated a meeting time relative to “EST.” I explained that EST is not the same as EDT, and since I’m not in either of those time zones, I suspected they said EST when they meant EDT. They accused me of being “pedantic.”
They are right: the type of forensic analysis that I am called upon to do relies upon the pedantic behavior of computers. They do exactly what we tell them to do. I (and other forensic analysts) need this high degree of reliable reproducibility to understand when things do not match the expected pattern. One name for this in computer security is “anomaly detection.”
At the present time, as far as I can tell, every region in the eastern time zone is either on “standard time” or “daylight time.” Thus, it is possible to disambiguate when someone uses the wrong designation. For those of you who are confused: EST means Eastern Standard Time and is defined as being UTC (Coordinated Universal Time) – 5 hours. So when it is 11 am UTC, I can quickly compute the time in EST: I subtract 5 hours from it (6 am EST). On the other hand, EDT means Eastern Daylight Time and is defined as being UTC – 4 hours. Thus, 6 am EST is 7 am EDT.
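The arithmetic above can be checked with a short Python sketch. It models EST and EDT as fixed UTC offsets only, exactly as defined in the text; it does not model any region's switching rules, and the dates are arbitrary illustrations.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as defined in the text; these are not full time zone rules.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# 11 am UTC on an arbitrary, illustrative date.
utc_time = datetime(2023, 1, 15, 11, 0, tzinfo=timezone.utc)
as_est = utc_time.astimezone(EST)
as_edt = utc_time.astimezone(EDT)

print(as_est.strftime("%H:%M %Z"))  # 06:00 EST
print(as_edt.strftime("%H:%M %Z"))  # 07:00 EDT

# The same wall-clock reading with the two labels names two different instants.
meeting_est = datetime(2023, 7, 1, 10, 0, tzinfo=EST)
meeting_edt = datetime(2023, 7, 1, 10, 0, tzinfo=EDT)
gap = meeting_est - meeting_edt
print(gap)  # 1:00:00
```

A meeting announced for “10 am EST” in July is therefore one hour later than the same meeting announced for “10 am EDT,” which is exactly the ambiguity I was asking about.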
You might think, “pfft, none of this makes any difference since a reasonable person would know what I mean even if I use the wrong timezone.” Except that some regions do not switch between standard and daylight time during the year (a switch made ostensibly to “save daylight”). Some US states don’t switch: Arizona, for example, uses MST (Mountain Standard Time) year-round, at least for now. At one point, Indiana did not require the use of daylight time during the summer. When I don’t live in your timezone, it seems unreasonable for you to demand that I know your timezone rules. Further, when I’m doing forensics, why would I remember that Indiana didn’t require using CDT in the summer from 1953 to 2006? So, yes, I am pedantic about this.
Computers are even more pedantic than I am. When writing software in the late 1970s, I was careful to ensure my code would compute leap years correctly for centuries ahead. These details were (and are) important to me. Working with operating systems has made me more attentive to these details, not less, since the operating system is generally a definitive source of “truth” concerning precisely these types of details. File systems, for example, preserve timestamps. Early file systems had limited space for storing timestamps and might choose not to store high precision times valid for millennia. For example, FAT stores times relative to January 1, 1980. Further, it captures them relative to the time zone of the system where they were captured. FAT has limited granularity for different timestamps as well:
- Creation timestamps can be accurate to 10 milliseconds.
- Modified timestamps are accurate to two seconds.
- Accessed timestamps are accurate to the day.
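To make those granularity limits concrete, here is a minimal sketch of decoding a FAT directory-entry timestamp. It assumes the commonly documented packed layout (a 16-bit date word, a 16-bit time word, and an optional creation-time field counting 10 ms units); the function name `decode_fat_timestamp` is my own, not part of any library.

```python
from datetime import datetime, timedelta

def decode_fat_timestamp(date_word: int, time_word: int, tenths: int = 0) -> datetime:
    """Decode packed FAT date/time words into a naive (local time) datetime.

    tenths is the optional creation-time field, counted in 10 ms units (0-199).
    """
    year = 1980 + ((date_word >> 9) & 0x7F)   # 7 bits: years since 1980
    month = (date_word >> 5) & 0x0F           # 4 bits
    day = date_word & 0x1F                    # 5 bits
    hour = (time_word >> 11) & 0x1F           # 5 bits
    minute = (time_word >> 5) & 0x3F          # 6 bits
    second = (time_word & 0x1F) * 2           # 5 bits, stored in two-second units
    base = datetime(year, month, day, hour, minute, second)
    return base + timedelta(milliseconds=10 * tenths)

# Illustrative values: March 15, 2004, 13:45:30, plus 1.5 s from the tenths field.
date_word = (24 << 9) | (3 << 5) | 15
time_word = (13 << 11) | (45 << 5) | 15
print(decode_fat_timestamp(date_word, time_word))       # 2004-03-15 13:45:30
print(decode_fat_timestamp(date_word, time_word, 150))  # 2004-03-15 13:45:31.500000
```

The result is deliberately a naive datetime: nothing in these fields records which time zone, or whether standard or daylight time, was in effect when the value was written.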
As storage space has become more plentiful, we have used wider and higher-resolution timestamps. The NTFS file system (Windows) uses 64-bit timestamps that count the number of 100-nanosecond intervals since January 1, 1601. Ext4 (commonly used by Linux) has nanosecond resolution with dates between December 14, 1901, and May 10, 2446. Of course, high precision does not mean the computer system on which the data was recorded was capable of such precision, nor does it even mean the time on the computer was set correctly.
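As a sketch of what the NTFS representation means in practice, the following converts such a 64-bit count into a Python datetime. It assumes the count is relative to midnight UTC on January 1, 1601; the helper name `filetime_to_datetime` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed epoch for the 64-bit count: midnight UTC, January 1, 1601.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a count of 100 ns intervals since 1601 into an aware datetime.

    Python datetimes resolve microseconds, so the final decimal digit
    of the count is dropped by the integer division.
    """
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

# 116444736000000000 intervals after 1601 is the Unix epoch.
print(filetime_to_datetime(116444736000000000))  # 1970-01-01 00:00:00+00:00
```

Note that the conversion itself discards precision, which is one more place where two tools reading the same timestamp can legitimately report slightly different values.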
So, what does this have to do with computer forensics? Although file system timestamps can be faked, it is surprisingly difficult for humans to do so because – unlike computers – humans tend to overlook minor variations. For example, EST versus EDT might not be something humans consider. They might not think that one day was EDT and another day EST. This becomes more interesting when we consider that different software layers might be recording timestamps. What is particularly useful here is that the source of the timestamps is likely the same (e.g., the operating system), and thus, the recorded result should be the same – or at least within some small window of time. If I see timestamps that are off by one hour, I often ask if they have been modified manually.
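The “off by one hour” check can be sketched in a few lines. This is only an illustration of the idea, not a description of any particular forensic tool: the function name `whole_hour_skew` and the five-second tolerance are assumptions I chose for the example.

```python
from datetime import datetime, timedelta

def whole_hour_skew(a: datetime, b: datetime,
                    tolerance: timedelta = timedelta(seconds=5)):
    """Return the nonzero whole-hour offset between a and b, or None.

    A difference that sits within `tolerance` of an exact number of hours
    is what a time zone or daylight-time mistake tends to look like.
    """
    delta_seconds = (a - b).total_seconds()
    hours = round(delta_seconds / 3600)
    if hours == 0:
        return None  # the two sources agree, or differ by far less than an hour
    residue = abs(delta_seconds - hours * 3600)
    return hours if residue <= tolerance.total_seconds() else None

# Illustrative values: one source says 14:00:00, the other 15:00:02.
file_system_time = datetime(2023, 3, 12, 14, 0, 0)
embedded_time = datetime(2023, 3, 12, 15, 0, 2)
print(whole_hour_skew(embedded_time, file_system_time))  # 1
```

A result of 1 or -1 is not proof of tampering. It is a prompt to ask why two records that should share a source disagree by almost exactly one hour.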
For example, I have talked about the importance of metadata in pictures (see “Why Metadata matters”) because both the software used to edit a picture and the file system that stores it independently record the timestamps. Knowing what ordinarily happens when a file is edited means that these timestamps become a form of “smoking gun” when what I see does not match what I should see. Of course, timestamp variance is not definitive evidence of malicious behavior. That is why part of my work is to reproduce the expected behavior in situ but also to consider other possible actions.
Doing this relies upon understanding what should happen and then reasoning about this. Since I have extensive experience in file systems implementation, development, analysis, and debugging, I have good intuition about what should happen. But, of course, I still make sure to confirm my expectations because that strengthens the persuasiveness of my observations and conclusions.
File systems, in particular, can be quite challenging to analyze thoroughly. For example, the FAT file system format is a well-documented, well-understood on-media format. However, the way that format is manipulated can vary dramatically by implementation. As an expert, I know it is important to consider not only the on-media format but also the implementation of the file system. The FAT file system implementation in Windows is (fortunately) publicly available because Microsoft distributes its source code as an example. The behavior of that implementation is quite different from the implementation of the same media format (FAT) inside a different device, such as a camera.
Why? Well, the FAT file system in Windows comes from a historical background in which it was used as the primary file system. Thus, performance was definitely important, including parallel performance, where multiple application programs were reading from and writing to the file system simultaneously. Therefore, the FAT implementation on Windows has robust mechanisms for allowing greater parallel activity, minimizing unnecessary I/O operations, and so on. For example, FAT on Windows does not zero out storage space when it is allocated. Instead, internally it tracks the “valid data length” and zeros out allocated but unwritten file regions when the file is closed. This differs from NTFS, which stores the “valid data length” as part of the file metadata. In either case, the goal is the same: ensure that the data previously stored on the media at that location is not exposed when the media is being re-used. Zero-filling that media at allocation time is one way of accomplishing this, but it is wasteful because, almost always, an application writes over the newly allocated data region.
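The “valid data length” idea can be illustrated with a toy model. This is a deliberately simplified sketch, not the Windows FAT or NTFS code: the class `ToyFile` is hypothetical, and it hides stale bytes at read time and zeros gaps at write time rather than at close.

```python
class ToyFile:
    """Toy model of valid data length (VDL) tracking; not real file system code."""

    def __init__(self, stale: bytes):
        # Newly allocated space still holds whatever was on the media before.
        self.blocks = bytearray(stale)
        self.valid_data_length = 0

    def write(self, offset: int, data: bytes) -> None:
        end = offset + len(data)
        if end > len(self.blocks):
            raise ValueError("write past allocated space")
        if offset > self.valid_data_length:
            # Zero the gap so stale bytes below the new VDL are never exposed.
            gap = offset - self.valid_data_length
            self.blocks[self.valid_data_length:offset] = bytes(gap)
        self.blocks[offset:end] = data
        self.valid_data_length = max(self.valid_data_length, end)

    def read(self, offset: int, length: int) -> bytes:
        end = min(offset + length, len(self.blocks))
        out = bytearray(self.blocks[offset:end])
        # Anything at or beyond the VDL reads back as zeros.
        start_zero = max(self.valid_data_length, offset)
        if start_zero < end:
            out[start_zero - offset:] = bytes(end - start_zero)
        return bytes(out)

f = ToyFile(b"SECRETSECRET")   # allocation reuses space holding old data
f.write(4, b"hi")
print(f.read(0, 12))           # b'\x00\x00\x00\x00hi\x00\x00\x00\x00\x00\x00'
```

The old contents are never returned to the reader, yet the only bytes actually zeroed are the four in the gap before the write. The rest of the allocation is left untouched on the assumption that the application will overwrite it.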
On the other hand, a camera does not worry about “multiple simultaneous users” because there’s exactly one thing using the media: the camera. Similarly, the camera doesn’t worry about zeroing out media at any point in time, because it knows that it will write out the data as part of its work. Therefore, the camera will have a simple block allocation strategy (because there’s no need for good parallelism). In general, the FAT implementation in the camera will be as simple as possible since a camera usually has less memory and processor power than a general-purpose computer.
This difference in behavior also applies to timestamps. When I edit a picture on the camera, the camera won’t behave the same way a general-purpose computer does. That’s because the camera’s reliability model is much simpler than the one for our general-purpose computer. The general-purpose computer has a common paradigm for “editing a file,” and it does not behave at all like what users think happens. The application starts by creating a new copy of the file. Maybe you have seen these files, which have strange names or similar names with quirky characters (the ~ is a common one that I have seen used). When you “save” the file, the application writes the data out to the new copy of the file and then asks the operating system to replace the old file with the new file. On Windows, the file system will note that this replacement is happening and will move attributes from the old file to the new one. This means you will see that the “creation time” of the file is preserved. That logic is clearly visible in Microsoft’s FAT file system code. You won’t find that sort of functionality in the camera implementation of FAT, because it has nothing to do with the on-media format. It is instead tied to the behavior of the operating system.
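The “write a new copy, then replace” paradigm looks roughly like this from the application side. This is a minimal sketch, assuming the temporary file lives in the same directory as the target; `save_by_replace` is a hypothetical helper, and the “~” prefix merely echoes the quirky names mentioned above.

```python
import os
import tempfile

def save_by_replace(path: str, data: bytes) -> None:
    """Write data to a temporary sibling file, then swap it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file sits next to the target so the swap stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="~", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Ask the operating system to replace the old file with the new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Nothing in this code mentions timestamps. Which creation time the file shows after the replacement is decided by the operating system and file system underneath, which is why the same “save” can leave different metadata on different platforms.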
So, when I was looking at metadata in the file system and comparing it to metadata in the file itself, I was relying on my understanding that the software used (Adobe Photoshop Elements 3.0) was running on Windows. Thus, I know that this “preserve the creation timestamp” behavior would have been implemented by FAT, but I also know that Adobe has no such mechanism.
Thus, I may be pedantic when I look at timestamps and consider time zones, but I do that because it is essential. Humans have a tough time reproducing the pedantically precise behavior of the Windows system itself. I must understand the distinction between daylight time and standard time because it matters. When I find metadata that does not match the expected pattern, I dig deeper: why is it different? What could cause that sort of anomaly?