Introduction
Analysts who perform macOS forensics have had few, if any, artifacts of program execution to rely on during investigations, until now. In macOS 10.13 (High Sierra), Apple introduced CoreAnalytics, a system diagnostics mechanism that maintains a record of Mach-O programs that have executed on a system over approximately one month. CoreAnalytics can serve a number of valuable analytical purposes for both insider threat investigations and incident response. The artifact can be used to:
- Determine the extent to which a system was in use, accurate to within one day
- Determine which programs were run on a particular day, whether in the foreground or in the background
- Determine how long, approximately, a program was running and/or active, as well as provide an approximate number of times the program was launched or brought to the foreground interactively
Summary Analysis
The CoreAnalytics artifact provides a historical and current perspective on program execution, on a near-daily basis. This data is derived from two sources:
- Files with the extension .core_analytics in /Library/Logs/DiagnosticReports/ that consist of JSON records: The first two records can be parsed to reveal the timestamps at which the diagnostic period began and ended; the data following those records indicates system and application usage over the diagnostic period.
- Files with GUID-like names in /private/var/db/analyticsd/aggregates/ that consist of nested arrays: The subsystems that report to the analytics daemon temporarily stage program execution data in these aggregate files for the current diagnostic period. The staged data is typically pushed to a .core_analytics file at the end of the diagnostic period.
The diagnostic period is defined in the .core_analytics files within the first two lines, which establish the times that the period began and ended. Each diagnostic period ends at the first system sleep or shutdown after 00:00:00 UTC. As mentioned above, data for the day is staged in aggregate files before being submitted to a .core_analytics file for longer-term storage at the end of the diagnostic period. Consequently, CoreAnalytics cannot be used to determine the exact time that a program was executed, but can be used to determine the time frame (approximately within a 24-hour period) in which the program was run.
In light of this artifact, we have written a Python script (available on our public GitHub) that parses both CoreAnalytics and aggregate files and writes the results to a more easily consumable JSON or CSV file. The CoreAnalyticsParser script will:
- Parse and convert the diagnostic period start and end timestamps into UTC and the ISO8601 format for each .core_analytics file
- Extract the relevant, available fields from each record from each available .core_analytics file
- Convert raw values that have been determined to indicate a number of seconds to a more readable %H:%M:%S strftime format
- Parse the aggregate file associated with the comappleosanalyticsappUsage subsystem into the same fields produced by that subsystem in each .core_analytics file.
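The seconds-to-duration conversion described in the third item can be sketched in a few lines of Python. This is only an illustration of the idea, not the CoreAnalyticsParser implementation; the helper name format_seconds is hypothetical.

```python
from datetime import timedelta


def format_seconds(raw):
    """Render a raw count of seconds as H:MM:SS (hypothetical helper)."""
    # timedelta's string form is H:MM:SS for durations shorter than one day.
    return str(timedelta(seconds=int(raw)))


print(format_seconds(26110))  # 7:15:10
```

Durations of a day or more would print with a leading "1 day, " prefix, so a stricter formatter may be preferable if such values are expected.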
The following is an example of a record from a .core_analytics file:
{ 'message': { 'activations': 105,
'activeTime': 4250,
'activityPeriods': 12,
'appDescription': 'com.google.Chrome ||| 67.0.3396.87 (3396.87)',
'foreground': 'YES',
'idleTimeouts': 4,
'launches': 0,
'powerTime': 12537,
'processName': 'Google Chrome',
'uptime': 26110},
'name': 'comappleosanalyticsappUsage',
'uuid': '4d7c9e4a-8c8c-4971-bce3-09d38d078849'}
Figure 1: CoreAnalytics record example for Google Chrome.
When parsed by CoreAnalyticsParser, the same record will appear as:
{ 'src_report': '/path/to/Analytics_2018-06-29-173717_ML-C02PA037R9QZ.core_analytics',
'diag_start': '2018-06-29T00:00:09Z',
'diag_end': '2018-06-30T00:37:17.660000Z',
'name': 'comappleosanalyticsappUsage',
'uuid': '4d7c9e4a-8c8c-4971-bce3-09d38d078849',
'processName': 'Google Chrome',
'appDescription': 'com.google.Chrome ||| 67.0.3396.87 (3396.87)',
'appName': 'com.google.Chrome',
'appVersion': '67.0.3396.87 (3396.87)',
'foreground': 'YES',
'uptime': '26110',
'uptime_parsed': '7:15:10',
'powerTime': '12537',
'powerTime_parsed': '3:28:57',
'activeTime': '4250',
'activeTime_parsed': '1:10:50',
'activations': '105',
'launches': '0',
'activityPeriods': '12',
'idleTimeouts': '4'}
Figure 2: The CoreAnalytics record from Fig. 1, parsed into JSON by the CoreAnalyticsParser script.
NOTE: The parsed example above is in JSON format, which the script generates when the -j flag is supplied. By default, the script outputs the same data in CSV format.
The script can run on a live system or run against a directory that contains either CoreAnalytics or aggregate files.
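Two of the derived fields in Figure 2, appName and appVersion, appear to come from splitting appDescription on its " ||| " delimiter, and the numeric values appear as strings. A minimal sketch of that flattening step follows, assuming a record already loaded as a Python dict; flatten_record is a hypothetical name and the script's actual logic may differ.

```python
def flatten_record(record):
    """Flatten one appUsage record into a single-level dict (illustrative only)."""
    message = record.get('message', {})
    flat = {'name': record.get('name'), 'uuid': record.get('uuid')}
    # Copy every message field, casting values to strings as in Figure 2.
    flat.update({key: str(value) for key, value in message.items()})
    # appDescription has the form '<identifier> ||| <version string>'.
    app_name, _, app_version = message.get('appDescription', '').partition(' ||| ')
    flat['appName'] = app_name
    flat['appVersion'] = app_version
    return flat


chrome = {
    'message': {'appDescription': 'com.google.Chrome ||| 67.0.3396.87 (3396.87)',
                'processName': 'Google Chrome',
                'uptime': 26110},
    'name': 'comappleosanalyticsappUsage',
    'uuid': '4d7c9e4a-8c8c-4971-bce3-09d38d078849',
}
print(flatten_record(chrome)['appName'])  # com.google.Chrome
```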
Technical Analysis
The CoreAnalytics Files
The .core_analytics files contain JSON records that indicate both program execution history and timestamps that define the particular diagnostic period in which the history data was collected. These files, which can be found in /Library/Logs/DiagnosticReports/, are named with the convention Analytics_YYYY-MM-DD-HHMMSS_<systemname>.core_analytics. The timestamp in the filename is based on the system’s local time. Prior to 10.13, the DiagnosticReports folder only contained application fault and crash reports. Now, there is execution data for applications whether they crashed or not.
The analytics daemon, which is responsible for producing and gathering system analytics and diagnostics data, maintains information about previously-written CoreAnalytics files in /private/var/db/analyticsd. In the root of the directory, the currentConfiguration.json file appears to maintain a dictionary of names, UUIDs and data types for the different subsystems that report to the daemon.
In the /private/var/db/analyticsd/journals directory, the da2-identity.json file maintains a listing of _marker records from recently generated CoreAnalytics files. The first entry generally predates the first available CoreAnalytics file by 7-10 days and the last entry appears to be one report behind the most recently written. Generally, this data may be used to confirm that all expected .core_analytics files are present and have not been tampered with.
Defining the Diagnostic Period
The first record in the .core_analytics files contains a timestamp field, where the value reflects the time that the diagnostic period ended. This timestamp is recorded in local time but, thankfully, in a timezone-aware format. If unaltered, this timestamp should match the last modified timestamp of the file. In other words, the .core_analytics file is only written to this location after the diagnostic period ends.
{ 'bug_type': '211',
'os_version': 'Mac OS X 10.13.5 (17F77)',
'timestamp': '2018-06-05 17:16:48.19 -0700'}
Figure 3: The first line of a .core_analytics file, with the diagnostic period end timestamp.
The time that the diagnostic period began can be found in the following JSON record, which begins with _marker. The UTC timestamp is the value paired with the startTimestamp field.
{ '_marker': '',
'_preferredUserInterfaceLanguage': 'en',
'_userInterfaceLanguage': 'en',
'_userSetRegionFormat': 'US',
'startTimestamp': '2018-06-05T00:19:13Z',
'version': '1.0'}
Figure 4: The second line of a .core_analytics file, with the diagnostic period start timestamp.
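The two timestamps in Figures 3 and 4 can be normalized to UTC in ISO 8601 form along the following lines. This sketch assumes each record is stored on its own line as standard double-quoted JSON (the figures show a pretty-printed rendering), and the function name diagnostic_period is hypothetical rather than taken from the CoreAnalyticsParser script.

```python
import json
from datetime import datetime, timezone


def diagnostic_period(first_line, second_line):
    """Return (start, end) of the diagnostic period as UTC ISO 8601 strings."""
    header = json.loads(first_line)
    marker = json.loads(second_line)

    # End: local time with a UTC offset, e.g. '2018-06-05 17:16:48.19 -0700'.
    end = datetime.strptime(header['timestamp'], '%Y-%m-%d %H:%M:%S.%f %z')
    # Start: already expressed in UTC, e.g. '2018-06-05T00:19:13Z'.
    start = datetime.strptime(marker['startTimestamp'], '%Y-%m-%dT%H:%M:%SZ')
    start = start.replace(tzinfo=timezone.utc)

    def to_utc_iso(moment):
        # Convert to UTC and use the 'Z' suffix seen in the parsed output.
        return moment.astimezone(timezone.utc).isoformat().replace('+00:00', 'Z')

    return to_utc_iso(start), to_utc_iso(end)


first = '{"bug_type": "211", "timestamp": "2018-06-05 17:16:48.19 -0700"}'
second = '{"_marker": "", "startTimestamp": "2018-06-05T00:19:13Z", "version": "1.0"}'
print(diagnostic_period(first, second))
```

Because the end timestamp carries its own offset, no knowledge of the system's configured timezone is needed to place it in UTC.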
The CoreAnalytics files are written to the DiagnosticReports directory almost daily, with a near-perfect cutover to the next file during consecutive days of system usage. CoreAnalytics files are not generated on days that the system is asleep or shut down for the entire diagnostic reporting period.
Diagnostic Period Began | Diagnostic Period Ended |
2018-06-08T01:51:23Z | 2018-06-09T01:50:01.370000Z |
2018-06-10T16:49:09Z | 2018-06-11T03:53:15.140000Z |
2018-06-11T03:53:14Z | 2018-06-12T02:50:17.410000Z |
2018-06-12T02:50:17Z | 2018-06-13T00:17:45.870000Z |
2018-06-13T00:17:45Z | 2018-06-14T01:17:06.340000Z |
The table above shows near-perfect cutover between .core_analytics files on consecutive days of use, and imperfect cutover when the system was not awake.
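One way to spot days without a report is to compare each period's end with the next period's start. The sketch below does this for the values in the table; the one-minute tolerance and the function names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.ffffff]Z' into an aware UTC datetime."""
    fmt = '%Y-%m-%dT%H:%M:%S.%fZ' if '.' in stamp else '%Y-%m-%dT%H:%M:%SZ'
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def find_gaps(periods, tolerance=timedelta(minutes=1)):
    """Return (previous_end, next_start) pairs that are more than `tolerance` apart."""
    ordered = sorted(periods, key=lambda period: parse_utc(period[0]))
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if parse_utc(next_start) - parse_utc(prev_end) > tolerance:
            gaps.append((prev_end, next_start))
    return gaps


periods = [
    ('2018-06-08T01:51:23Z', '2018-06-09T01:50:01.370000Z'),
    ('2018-06-10T16:49:09Z', '2018-06-11T03:53:15.140000Z'),
    ('2018-06-11T03:53:14Z', '2018-06-12T02:50:17.410000Z'),
    ('2018-06-12T02:50:17Z', '2018-06-13T00:17:45.870000Z'),
    ('2018-06-13T00:17:45Z', '2018-06-14T01:17:06.340000Z'),
]
print(find_gaps(periods))
```

For the table's data this reports a single gap, between the period ending on 2018-06-09 and the one starting on 2018-06-10.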
These files are typically generated in the DiagnosticReports folder daily upon the first sleep or shutdown past 00:00:00 UTC, based on our analysis of the binary plist /private/var/db/analyticsd/Library/Preferences/analyticsd.plist. This plist records the last submission time and next submission time of the .core_analytics file in Unix Epoch format.
Key: cadence
Type: String
Value:
{ 'bootToken': 1530574585000000,
'lastSubmission': 1531256233,
'nextSubmission': 1531267200,
'osVersion': '17E202',
'version': 1}
Figure 5: Contents of the analyticsd.plist that reveal last submission and next submission timestamps.
However, our testing shows that the report is generally written at the first sleep or shutdown after the submission time has passed.
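The epoch values from Figure 5 can be made readable with a short conversion. In this sketch the helper name epoch_to_iso is hypothetical, and treating bootToken as a count of microseconds is an assumption based only on its magnitude.

```python
from datetime import datetime, timezone


def epoch_to_iso(seconds):
    """Convert Unix epoch seconds to a UTC ISO 8601 string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%dT%H:%M:%SZ')


cadence = {
    'bootToken': 1530574585000000,
    'lastSubmission': 1531256233,
    'nextSubmission': 1531267200,
}

print(epoch_to_iso(cadence['lastSubmission']))   # 2018-07-10T20:57:13Z
print(epoch_to_iso(cadence['nextSubmission']))   # 2018-07-11T00:00:00Z
# Assumption: bootToken is in microseconds, so scale it down to seconds.
print(epoch_to_iso(cadence['bootToken'] // 1_000_000))
```

The nextSubmission value falls exactly on 00:00:00 UTC, which is consistent with the daily cutover described above.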
System Usage
After the timestamp and marker records, the records produced by the comappleosanalyticssystemUsage subsystem reflect the uptime of the system in seconds. New records are likely generated by this subsystem after the machine sleeps or shuts down. The sum of the uptime values in these records reflects the total time that the system was awake. The Uptime field appears to be the uptime value rounded down to the nearest thousand (247 becomes 0, and 2431 becomes 2000). The activeTime field likely indicates the amount of time, in seconds, that the system was actively in use while awake. The two records below suggest that the system was awake for two periods of time (approximately 4 and 40 minutes long, respectively) for a total of 44 minutes and 38 seconds, but the system was actively used for only 14 minutes and 26 seconds.
{ 'message': { 'Uptime': 0,
'activations': 2,
'activeTime': 42,
'idleTimeouts': 1,
'uptime': 247},
'name': 'comappleosanalyticssystemUsage',
'uuid': '00866801-81a5-466a-a51e-a24b606ce5f1'}
{ 'message': { 'Uptime': 2000,
'activations': 2,
'activeTime': 824,
'idleTimeouts': 1,
'uptime': 2431},
'name': 'comappleosanalyticssystemUsage',
'uuid': '00866801-81a5-466a-a51e-a24b606ce5f1'}
Figure 6: Example of two records that reveal system usage information.
Immediately afterward, two records appear that indicate two-hour and one-day heartbeats. The significance of these lines is yet unclear, as these heartbeats do not appear to correlate with the duration of the diagnostic period or the uptime of any of the programs recorded. The field name BogusFieldNotActuallyEverUsed may indicate that not only the field, but the data itself has been deprecated.
{ 'message': { 'BogusFieldNotActuallyEverUsed': 'null', 'Count': 7},
'name': 'TwoHourHeartbeatCount',
'uuid': '7ad14604-ce6e-45f3-bd39-5bc186d92049'}
{ 'message': { 'BogusFieldNotActuallyEverUsed': 'null', 'Count': 1},
'name': 'OneDayHeartBeatCount',
'uuid': 'a4813163-fd49-44ea-b3e1-e47a015e629c'}
Figure 7: Heartbeat records.
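Returning to the system usage records in Figure 6, the totals quoted earlier (44 minutes 38 seconds awake, 14 minutes 26 seconds active) can be reproduced by summing the uptime and activeTime values. The sketch below assumes the records are already loaded as Python dicts; summarize_system_usage is a hypothetical helper name.

```python
from datetime import timedelta

SYSTEM_USAGE = 'comappleosanalyticssystemUsage'


def summarize_system_usage(records):
    """Total the awake and active seconds across all systemUsage records."""
    messages = [r['message'] for r in records if r.get('name') == SYSTEM_USAGE]
    return {
        'periods': len(messages),
        'uptime': sum(m['uptime'] for m in messages),
        'activeTime': sum(m['activeTime'] for m in messages),
    }


records = [
    {'message': {'Uptime': 0, 'activations': 2, 'activeTime': 42,
                 'idleTimeouts': 1, 'uptime': 247},
     'name': SYSTEM_USAGE},
    {'message': {'Uptime': 2000, 'activations': 2, 'activeTime': 824,
                 'idleTimeouts': 1, 'uptime': 2431},
     'name': SYSTEM_USAGE},
    {'message': {'Count': 7}, 'name': 'TwoHourHeartbeatCount'},
]

totals = summarize_system_usage(records)
print(timedelta(seconds=totals['uptime']))      # 0:44:38
print(timedelta(seconds=totals['activeTime']))  # 0:14:26
```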
Application Usage
All of the lines that follow consist of three keys: name, uuid, and message. The name and uuid keys map to the particular subsystem that generated the record, while the message field contains a nested JSON record that contains additional data. The most common subsystems and UUIDs that we have observed are:
Sub-System Name | UUID |
comappleosanalyticsappUsage | 4d7c9e4a-8c8c-4971-bce3-09d38d078849 |
comappleosanalyticssystemUsage | 00866801-81a5-466a-a51e-a24b606ce5f1 |
comappleosanalyticsMASAppUsage | 0fd0693a-3d0a-48be-bdb2-528e18a3e86c |
TwoHourHeartbeatCount | 7ad14604-ce6e-45f3-bd39-5bc186d92049 |
OneDayHeartBeatCount | a4813163-fd49-44ea-b3e1-e47a015e629c |
The comappleosanalyticsappUsage subsystem produces a single record per program that is executed.
{ 'message': { 'activations': 105,
'activeTime': 4250,
'activityPeriods': 12,
'appDescription': 'com.google.Chrome ||| 67.0.3396.87 (3396.87)',
'foreground': 'YES',
'idleTimeouts': 4,
'launches': 0,
'powerTime': 12537,
'processName': 'Google Chrome',
'uptime': 26110},
'name': 'comappleosanalyticsappUsage',
'uuid': '4d7c9e4a-8c8c-4971-bce3-09d38d078849'}
Figure 8: Sample .core_analytics record produced by the comappleosanalyticsappUsage subsystem.
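Records like the one in Figure 8 can be picked out of a file by filtering on the name key. The following sketch assumes one standard JSON record per line, as the line-oriented description above suggests; app_usage_records is a hypothetical name, and lines that do not parse are simply skipped.

```python
import json

APP_USAGE = 'comappleosanalyticsappUsage'


def app_usage_records(lines):
    """Yield the message dict of every appUsage record in an iterable of lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a standalone JSON record
        if isinstance(record, dict) and record.get('name') == APP_USAGE:
            yield record.get('message', {})


sample_lines = [
    '{"bug_type": "211", "timestamp": "2018-06-05 17:16:48.19 -0700"}',
    '{"_marker": "", "startTimestamp": "2018-06-05T00:19:13Z"}',
    '{"message": {"Count": 7}, "name": "TwoHourHeartbeatCount"}',
    '{"message": {"processName": "Google Chrome", "uptime": 26110}, '
    '"name": "comappleosanalyticsappUsage"}',
]
for message in app_usage_records(sample_lines):
    print(message['processName'], message['uptime'])
```

The same generator accepts an open file object, since iterating over a text file yields its lines.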
The fields under message may provide significant forensic value to an analyst; nine of them are described below.
- activeTime likely provides the number of seconds that a program ran in the foreground.
- activityPeriods likely provides the number of instances that a program was brought to the foreground.
- appDescription is populated by data that is directly pulled from the Info.plist that resides within the pertinent program’s application bundle. If the requisite keys from the Info.plist are malformed or empty, they may appear as ???.
<CFBundleIdentifier> ||| <CFBundleShortVersionString> (<CFBundleVersion>)
Figure 9: Data Format of CoreAnalytics Record
Below are the keys that were used to populate the record above, from the Info.plist located within Google Chrome.app.
<key>CFBundleIdentifier</key>
<string>com.google.Chrome</string>
<key>CFBundleShortVersionString</key>
<string>67.0.3396.99</string>
<key>CFBundleVersion</key>
<string>3396.99</string>
Figure 10: Keys from Google Chrome.app Info.plist
If the program was run as an independent Mach-O executable, or the Info.plist is either unavailable, malformed, or incomplete, the appDescription will appear as UNBUNDLED ||| ???.
In the example below, the CFBundleVersion key in the GlobalProtect Info.plist was not populated.
com.paloaltonetworks.GlobalProtect ||| 4.0.2-19 (???)
Figure 11: Value of appDescription for GlobalProtect CoreAnalytics record
- foreground provides a YES or NO string value that indicates whether the program was run in the foreground.
- idleTimeouts has a purpose that is as yet unclear.
- launches likely indicates the number of times the program was launched during the diagnostic reporting period. The value of launches will remain at zero if the program was launched prior to the beginning of the diagnostic period.
- powerTime, based on our testing, likely reflects the number of seconds that the program was running and consuming AC power.
- processName is derived from the CFBundleExecutable key in the Info.plist that resides within the pertinent program’s application bundle.
Below is the key that was used to populate the CoreAnalytics record above, from the Info.plist located within Google Chrome.app.
<key>CFBundleExecutable</key>
<string>Google Chrome</string>
Figure 12: CFBundleExecutable Key from Google Chrome.app Info.plist
If the program was run as an independent Mach-O executable, or the Info.plist is either unavailable, malformed or incomplete, this field is still populated in the CoreAnalytics record. Through our testing, we have not been able to identify the secondary source used by the analytics daemon to obtain this data.
In some cases, as shown below, the processName field will be left empty. In these scenarios, it is practically impossible to determine which program’s execution has been recorded via CoreAnalytics.
{ 'message': { 'activations': 0,
'activeTime': 0,
'activityPeriods': 0,
'appDescription': 'UNBUNDLED ||| ???',
'foreground': 'NO',
'idleTimeouts': 0,
'launches': 2,
'powerTime': 0,
'processName': '',
'uptime': 24},
'name': 'comappleosanalyticsappUsage',
'uuid': '4d7c9e4a-8c8c-4971-bce3-09d38d078849'}
Figure 13: CoreAnalytics Record for Unnamed Program
- uptime likely reflects the total time, in number of seconds, that a program has been running. This does not include time that the system was asleep or shut down. Certain programs, such as the Dock, will have uptimes that either exactly or nearly match the total uptime of the system during the diagnostic period. In the two records from the System Usage section above, the sum of the uptimes (247 and 2431) exactly amounts to the uptime of the Dock application as seen below (2678).
{ 'message': { 'activations': 0,
'activeTime': 0,
'activityPeriods': 0,
'appDescription': 'com.apple.dock ||| 1.8 (1849.16)',
'foreground': 'NO',
'idleTimeouts': 0,
'launches': 0,
'powerTime': 0,
'processName': 'Dock',
'uptime': 2678},
'name': 'comappleosanalyticsappUsage',
'uuid': '4d7c9e4a-8c8c-4971-bce3-09d38d078849'}
Figure 14: CoreAnalytics Record for Dock
In our testing, we identified that a different subsystem, comappleosanalyticsMASAppUsage, wrote a record to the CoreAnalytics file for Microsoft OneNote. The nested JSON in the message key has a different structure than in records written by comappleosanalyticsappUsage. Rather than an appDescription field, this record natively divides the CFBundleIdentifier and the CFBundleShortVersionString (along with the CFBundleVersion) into identifier and version fields. The record also lacks all of the other message keys aside from launches.
{ 'message': { 'identifier': 'com.microsoft.onenote.mac',
'launches': 1,
'version': '15.32 (15.32.17030400)'},
'name': 'comappleosanalyticsMASAppUsage',
'uuid': '0fd0693a-3d0a-48be-bdb2-528e18a3e86c'}
Figure 15: CoreAnalytics record for Microsoft OneNote, produced by a different analytics subsystem.
It is possible that other subsystems produce records with dissimilar data structures.
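Because the two layouts differ, code that consumes these records may want to normalize them first. A minimal sketch follows, assuming only the two structures shown in this article; identify_app is a hypothetical helper, and records from other subsystems would simply yield empty strings.

```python
def identify_app(record):
    """Return (identifier, version, launches) for either record layout."""
    message = record.get('message', {})
    if 'appDescription' in message:
        # appUsage layout: '<identifier> ||| <version string>'
        identifier, _, version = message['appDescription'].partition(' ||| ')
    else:
        # MASAppUsage layout: identifier and version are separate keys
        identifier = message.get('identifier', '')
        version = message.get('version', '')
    return identifier, version, message.get('launches')


onenote = {'message': {'identifier': 'com.microsoft.onenote.mac',
                       'launches': 1,
                       'version': '15.32 (15.32.17030400)'},
           'name': 'comappleosanalyticsMASAppUsage'}
dock = {'message': {'appDescription': 'com.apple.dock ||| 1.8 (1849.16)',
                    'launches': 0,
                    'processName': 'Dock'},
        'name': 'comappleosanalyticsappUsage'}

print(identify_app(onenote))
print(identify_app(dock))
```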
Staging the Day’s Execution Data
It is possible to recover CoreAnalytics data before it is written to the CoreAnalytics file at the end of the diagnostic period. The /private/var/db/analyticsd/aggregates/ directory serves as a temporary staging location for each subsystem to store analytics data, before the data must be submitted and pushed to the day’s CoreAnalytics file. The directory contains one staging file per subsystem, where the filename is the subsystem’s UUID. For example, the file 4d7c9e4a-8c8c-4971-bce3-09d38d078849 contains data generated by version 1.0 of the comappleosanalyticsappUsage subsystem that will be written to a CoreAnalytics file in /Library/Logs/DiagnosticReports/ at the first sleep or shutdown after 00:00:00 UTC. The contents of these files appear to be a set of nested arrays. The array for Google Chrome, for example, appears as below:
< <'Google Chrome', 'com.google.Chrome ||| 67.0.3396.99 (3396.99)', 'YES'>,
<5660, 145, 0, 0, 5, 2, 1020>>
Figure 16: Aggregate data for Google Chrome.
The values correspond to the following fields that would normally appear in a CoreAnalytics record produced by the comappleosanalyticsappUsage subsystem:
< <processName, appDescription, foreground>,
<...>>
Figure 17: Array structure for aggregate data.
By parsing out the array of values in an aggregate file and correlating them with the fields above, it becomes possible to analyze application usage information for the current day, before it is written out to a CoreAnalytics file.
Conclusion
CoreAnalytics provides a trove of information about the usage of a system and its applications. Program execution history that covers a month of activity can serve a crucial purpose in investigations where collection of evidence is not immediately feasible. Though documentation from Apple may provide additional clarity into the purpose of certain fields and the nature of their values, the analysis above provides a strong basis on which analysts can begin to investigate application activity on macOS systems.
For more information on CrowdStrike's Incident Response, Compromise Assessment or Threat Hunting offerings, visit the CrowdStrike Services page or reach out to us at Services@crowdstrike.com.