How to Access User Guide and Helpful Documents:
The CBC user guide provides an in-depth guide to many topics within the Carbon Black Cloud. The user guide is accessible through the console under the “help” section. The guide is divided into various topics from sensor installation to detailed guides for each page within the console.
User Guide:
The guide is accessible at this link: VMware Carbon Black Cloud User Guide
Useful Links:
Below are some useful links that can provide general information about the CBC Sensor and Console.
- Mastering Carbon Black Endpoint Standard (Training Resource)
- Carbon Black Cloud Endpoint Standard FAQ
- Carbon Black Cloud sensor support
- Carbon Black EDR Supported Versions Grid
- Carbon Black EDR Product Support Lifecycle Policy
- Carbon Black Support Release Lifecycle Status
- Product Release Lifecycle and Support Schedule
- VMware Carbon Black Sensor Installation Guide
- Carbon Black Cloud Sensor AV Exclusions
- Setting Up AV Exclusions in the CBC Console for Other AV
Accessing Release Notes:
Another useful location on the User eXchange is the release note section, found here: Carbon Black Cloud Sensor/Console Release Notes
REPCLI – The Command Line Tool:
What Is REPCLI:
The REPCLI utility is a command line interface for the CBC sensor. Each CBC sensor comes with REPCLI access for limited commands, but to enable the full scope of the utility an authenticated user SID must be specified. While any account can pass basic REPCLI commands, the authenticated commands—including the commands to enable sensor bypass locally or force a sensor to check in with the console—must be issued from the account with the specified SID. The REPCLI user SID can be either a group or an individual user, depending on preference.
VMware Carbon Black recommends enabling authenticated commands by specifying a REPCLI user. This capability is essential when troubleshooting disconnected sensors and can often avoid a sensor reinstallation. Below are a few of the guides that provide an overview of the REPCLI utility:
How to Use REPCLI:
The REPCLI tool must be accessed through the command prompt, and for any authenticated commands the command prompt must be executed as the authorized SID. Below are some guides for the REPCLI utility that offer more detail:
- How to Access repcli
- How to enable repcli during sensor install
- How to enable repcli on existing sensors
Why Enable REPCLI:
The REPCLI tool expands the sensor’s capabilities and allows for more options when troubleshooting the sensor if it is offline. Some common use cases are below:
- Verify Sensor Status
- Force Sensor to Check In
- Toggle Sensor Bypass
- Update Virus Definition Files
- Run an On Demand Scan
- Gather Logs for Support
Common Issue Types:
Interoperability Issues
Symptoms:
The symptoms of an interoperability issue may vary depending on the applications involved. The most common presentations are overutilization of resources and / or an unexpected application crash.
Diagnostics:
When troubleshooting an interoperability issue, the first step is to collect all possible diagnostics to ensure the Red Canary support team has the data needed to investigate further. It is critical to collect these diagnostics before uninstalling the sensor. The diagnostics available will depend on the OS of the device. See below for details:
Windows:
For Windows devices, two items are useful to the investigation: sensor logs, and process monitor. For more information about collecting these logs, please see the guides below:
MacOS:
For Mac devices the best available data will be the sensor logs which can be gathered as shown below:
Linux:
Linux also only requires the sensor logs, pulled as shown below:
How to troubleshoot:
Troubleshooting Unexpected Crashes
If the application is crashing, the first step is to confirm that the crash is not caused by an unexpected block. This is best accomplished by searching the Investigate page for the device name together with the TTP POLICY_DENY or POLICY_TERMINATE:
Additionally, data can be limited further by setting a custom filter for the approximate time of the crash:
If no results are returned from these searches, it is a good indication that the application is crashing without the sensor intentionally blocking, which is indicative of an interop issue. Otherwise, it is likely an unexpected block and should be investigated as such.
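The search described above can also be assembled as a query string. The following is a minimal Python sketch that follows the query form shown in the Unexpected Blocks section of this guide. The function name and the device name "WIN-LAPTOP-01" are hypothetical and used only for illustration.

```python
def build_block_query(device_name: str) -> str:
    """Return an Investigate-page search for blocking events on one device."""
    # The two TTPs named in this guide for deny and terminate actions.
    ttps = ("POLICY_DENY", "POLICY_TERMINATE")
    ttp_clause = " OR ".join(f"ttp:{ttp}" for ttp in ttps)
    return f"device_name:{device_name} AND ({ttp_clause})"


# "WIN-LAPTOP-01" is a placeholder device name.
print(build_block_query("WIN-LAPTOP-01"))
# device_name:WIN-LAPTOP-01 AND (ttp:POLICY_DENY OR ttp:POLICY_TERMINATE)
```

The resulting string can be pasted into the Investigate page search bar, with the time filter then set in the console as described above.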
Troubleshooting Resource Overutilization
If the issue presents as resource overutilization, the first step is to determine if the issue can be replicated. If the issue presents under specific conditions, put the sensor into bypass and check if the issue persists. As sensor bypass mimics uninstall, if the issue continues to present it is possible that the sensor is not the cause. If the issue is resolved, it indicates an interoperability issue with the sensor.
Temporary Solutions for Interoperability Issues
Any time an interoperability issue occurs, it is best practice to collect all possible data—see diagnostics for more information—and open a support ticket. This ensures that Red Canary can resolve the issue by the most secure means, and possibly look at accounting for the interop issue in future sensor versions. Once the data has been collected, depending on the criticality of the issue, a bypass or API bypass rule can help alleviate the issue in the short term.
To identify the path needed to bypass, identify the application that is crashing or causing the resource overutilization. Once the application is identified, identify the paths that contain the application processes. Once this data is collected, bypass rules can be constructed to alleviate the issue.
In the example below, the problem application resides in C:\users\jdoe\failing application\ and its subfolders:
1. Open the “Enforce” drop-down in the CBC console menu and select “Policies”. Select the desired policy, then the “Prevention” tab:
2. Select the “+” next to permissions to expand the section. Next, select the “Add application path” option at the bottom of the section.
3. Input the desired application path, then select “Performs any operation” or “Performs any API operation” and bypass. If it is unclear which of the two to choose, start by bypassing only API operations, as this is more limited. When inputting the path, a single “*” character will include a single folder, while a “**” is recursive and will include all sub-folders.
In this example, the “failing application” folder and all sub folders would be API bypassed if they were located within the C:\users\ directory.
4. Once the path is entered, select the “Confirm” button then “Save” to update the policy. Please wait ten minutes to allow time for the sensors to update their policy before retesting.
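To make the difference between “*” and “**” concrete, the following Python sketch mimics the wildcard behavior described in step 3 by translating a pattern into a regular expression. It only illustrates the stated semantics for Windows-style paths and is not the sensor's actual matching logic. The rule path and both test paths are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a path pattern using * and ** into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")        # recursive: any depth, separators included
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^\\]*")   # single folder level: stops at a backslash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    # Case-insensitive, as an assumption for Windows paths.
    return re.compile("".join(parts), re.IGNORECASE)


def path_matches(pattern: str, path: str) -> bool:
    """Return True if the whole path is covered by the pattern."""
    return wildcard_to_regex(pattern).fullmatch(path) is not None


# Hypothetical rule: any single user folder, then the application folder and everything below it.
rule = r"C:\users\*\failing application\**"
print(path_matches(rule, r"C:\users\jdoe\failing application\bin\app.exe"))    # True
print(path_matches(rule, r"C:\users\jdoe\other\failing application\app.exe"))  # False
```

The second path is not matched because the single “*” covers only one folder level, which is why a narrowly scoped rule of this form keeps the bypass as specific as possible.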
Important Reminder
A bypass or API bypass rule always results in a lower security posture. A bypass rule causes the sensor to stop monitoring, reporting on, or blocking anything at the targeted path. Because of this, any bypass rule should be as specific as reasonably possible, and if a bypass rule is needed, Red Canary recommends opening a ticket with support to investigate further.
To ensure the support ticket is handled as efficiently as possible, please provide the following when opening tickets regarding interoperability issues:
- Device name in the Carbon Black Cloud Console
- Application experiencing interoperability
- Date/Time interoperability occurred
- Application actions performed or blocked
- Are results the same if Sensor is placed in Bypass?
- Remove Sensor from Bypass and attempt to implement a Permissions rule
- Test using a Permissions rule
- Is the interoperability issue reproducible?
- If yes, collect 2 separate Procmon logs:
- Steps for Windows 3.3.x.x Sensor and earlier or steps for Windows 3.4.x.x Sensor and higher
- Procmon with the Sensor Active and with the issue reproduced
- Procmon with the Sensor in Bypass taking the same steps as above
- After collecting Procmon logs, request Sensor logs:
- Collecting Sensor Logs Windows
- Collecting Sensor Logs Mac
Unexpected Blocks
Unexpected or false positive blocks are most frequently caused by side effects of path-based policy. Path-based policy supersedes reputation-based policy rules, so if a specific path is denied it will not be allowed to execute regardless of its reputation. Alternatively, some false positives may be caused by inaccurate reputation being assigned to a file.
Symptoms:
There are two indications of a false positive block: either an alert or notification from the console, or an end user reports that an application is crashing. The latter case may also be an indication of an interoperability issue and further investigation will be needed.
Diagnostics:
If it is unclear what is causing the block and a support ticket is required, please provide sensor logs for the operating system:
How to troubleshoot:
The steps to troubleshoot an unexpected block are as follows:
1. Find the blocking event. If an alert was not generated, start by searching the investigate page for the blocking event using the following logic:
device_name:name of device AND (ttp:POLICY_DENY OR ttp:POLICY_TERMINATE)
If needed, set the time filter to the approximate time the block occurred to narrow results:
2. Next, expand the event by clicking the “>” symbol at the end of the row:
3. Examine the details pane. Note the event ID and the “reason”.
In this case, no reason was given other than the application was running, indicating that a path-based policy rule is at fault. Additionally, checking the “Effective Reputation” section shows the application enjoyed a LOCAL_WHITE reputation at the time of the block, a further indicator that reputation-based rules were not at fault.
4. Next, navigate to the “Enforce” drop down and select the “Policies” page. Select the policy in which the endpoint resides. Navigate to the “Prevention” tab and review the policy to determine which rule caused the block.
5. In this case it is clear the application was terminated due to the rule “**Chrome** Runs or is running Terminate Process”. Adjust the rule as needed to alleviate the false positive block.
Tamper Protection:
There will be times that another security/endpoint monitoring program may attempt to interact with the Carbon Black Cloud sensor and therefore engage the tamper protection feature within the sensor. This behavior can be observed within the SensorAlarms.log for Windows sensors. You can see an example of the log entry below:
xx/xx/xx 12:00:00, Assertions, Assertion Detected: File[d:\Jenkins\workspace\CbD_Build_Windows_Agent_3.8\398\common\repmgr\FiltFile.cpp@438] Function[KernelCommunication::KernelMsgDispatcher::ProcessPolicyRequest] Assert[status == SI_ERR_SUCCESS || status == SI_ERR_NOT_COMPLETE || status == SI_ERR_BAD_PAR
Once the product interacting with the CBC sensor has been identified, third party exclusions may need to be added to prevent scanning or interaction with the CBC sensor. You can reference the link below for recommended third party exclusions.
Further Reading:
Below are some helpful guides when either troubleshooting policy rules or attempting to add rules for enhanced security:
- Application blocked unexpectedly
- Endpoint Standard Policy tuning: Good, Better, Best
- Troubleshooting Configuration VS. Interop
Performance Issue:
Symptoms:
Generally performance cases present in one of four ways:
- High CPU usage
- High memory usage
- High disk usage
- Slow opening / saving files over network
Regardless of the type of slowness experienced, diagnostic collection is similar.
Diagnostics:
The diagnostics gathered will largely depend on the OS:
How to Troubleshoot:
The best way to troubleshoot the issue is using the following steps:
- Identify the circumstances of the issue:
- Does it occur under specific circumstances, or is it a general performance impact?
- Is there anything unique about the endpoints experiencing the issue? Does it only happen on VDI devices, devices running either off premises or on a specific VLAN, devices running a specific OS or sensor version, specific types of devices (Dell devices, Lenovo, etc.), or devices running a specific application?
- Place the device in bypass and see if the issue persists.
- If the issue persists, it is likely caused by something other than the sensor. Verify this by uninstalling the sensor if needed.
- If the issue is resolved by placing the sensor in bypass, it is likely that a bypass rule can resolve the issue as a temporary fix. Next steps would be to try to determine what processes are causing the interop and target them with bypass rules if possible. If this is not possible, submit the device logs and procmon to support for assistance identifying the process and crafting the rule.
- Even if the issue is resolved by a bypass rule, please open a ticket with support to identify if there are any non-bypass solutions to the performance issue. Every bypass rule carries inherent risk and should be avoided if possible.
Connectivity Issues:
Connectivity issues can occur after the sensor is installed if the sensor becomes unable to contact the console. This often occurs as the result of a change to the firewall or other network changes. Alternatively it can be caused by corruption to a sensor.
Symptoms:
Connectivity issues can present differently depending on the duration of the issue.
- Endpoints may appear deregistered if they have been offline for a long duration
- Endpoints may appear inactive if they have not checked in for 30 days
- Endpoints may appear active, but they are not getting recent policy changes. This is usually indicated by the policy being shown in italics when viewing the endpoints on the policy page.
Diagnostics:
The best way to diagnose the issue is using sensor logs, as well as any OS level logging available. However, in some cases a Wireshark capture may also be necessary:
- How to collect sensor logs.
- How to collect a Wireshark capture.
How to Troubleshoot:
Troubleshooting begins with identifying the issue’s breadth, continues with verifying the endpoint’s connectivity, and if necessary, reviewing logging and evidence for detailed errors:
- Is the issue impacting a few endpoints, or many?
- If the issue is impacting an entire environment, verify firewall and proxy settings
- Are there any consistent environmental factors: are the offline devices on the same subnet, for example? Were there any recent changes that may correlate with the issue?
- Verify the disconnected endpoint can access the console
- If it is not possible to determine the issue from the above steps, open a ticket with support providing the results of the above investigative steps along with sensor logs.
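As a first check of whether a disconnected endpoint can reach the console at all, a basic TCP connection test can be run from the endpoint. The following Python sketch assumes the console is reached over HTTPS on port 443. The hostname in the usage comment is a placeholder, not a real console address, and a successful TCP connection does not by itself confirm that proxy or TLS settings are correct.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Hypothetical usage; replace the placeholder with the console hostname for your environment:
# print(can_connect("console.example.com"))
```

A False result from an affected endpoint alongside a True result from a healthy one on a different subnet points toward a firewall or routing change rather than the sensor itself.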
Installation Issues:
Installation issues have a variety of causes from environmental misconfiguration to unclean uninstalls or upgrades of the CBC sensor.
Symptoms:
Generally, a failed installation will present as a sensor that either doesn’t install or has an error during the installation process. In some edge cases, the sensor may appear to install on the endpoint but does not check into the console.
Diagnostics:
The most useful documentation for troubleshooting is the sensor installation logs, which can be gathered in a few different ways depending on how the installation is being performed:
- Unattended Installation: This method creates verbose MSI logs at the path specified after the /L*vx logging option. When scanning the logs, look for error 1603 (a generic MSI failure code), then read a few lines above and below it for the actual error.
- For attended install, check for the version.log file in the following directories:
- Windows: C:\Windows\TEMP\
- Windows: C:\Users\<user>\AppData\Local\Temp\
- Windows: C:\Users\All Users\AppData\Local\Temp
- For Mac devices, the equivalent of confer-temp.log is confer-preinstall-xxxxxxxx.log and confer-postinstall-xxxxxxx.log/tmp. Check the following directories:
- Mac: /Applications/Confer.app/
- MacOS Big Sur: /Library/Application Support/com.vmware.carbonblack.cloud/Logs
- For Linux devices, look for the cbagentd-install.log file in the following location:
- /var/opt/carbonblack/psc/log
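When reviewing a verbose MSI log for the 1603 error, it can help to pull out the lines surrounding each occurrence. The following Python sketch does this for a list of log lines. The function name and the three-line window are illustrative choices, and the commented usage assumes the log is UTF-16 encoded, which is common for MSI logs but should be confirmed for the file in question.

```python
from typing import List


def find_error_context(lines: List[str], marker: str = "1603", radius: int = 3) -> List[List[str]]:
    """Return a window of lines around each line that contains marker."""
    windows = []
    for index, line in enumerate(lines):
        if marker in line:
            start = max(0, index - radius)
            windows.append(lines[start:index + radius + 1])
    return windows


# Hypothetical usage; the log path is a placeholder:
# with open(r"C:\path\to\install.log", encoding="utf-16", errors="replace") as handle:
#     for window in find_error_context(handle.read().splitlines()):
#         print("\n".join(window))
#         print("---")
```

Because "1603" is matched as plain text, it may also match unrelated numbers such as timestamps, so each window should still be read in context.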
How to Troubleshoot:
Installation failures can have a variety of causes and resolutions. The following steps should help identify the issue and solution:
- Has this device ever had a CBC sensor installed?
- If yes:
- How was the original sensor deployed? Some deployment solutions will control the sensor’s registry keys and prevent updates via the console.
- What prompted the reinstallation of the sensor? Errors in the previously installed sensor could indicate a corrupted sensor that did not uninstall cleanly.
- If yes:
- Collect and examine the installation logs. Look for the preinstall checks, specifically looking for the registry key checks.
- May need the Sensor Removal Tool
- If this is a new installation:
- How is the sensor being installed? Verify that all relevant instructions are followed for the installation depending on the method:
- Attended Method
- Unattended Method
- VDI
- Note that the VDI installation process is currently under revision. When the new process is released, this guide will need to be verified.
Further Reading
Please review the general sensor installation troubleshooting guide.
Suspected Missed Block
A missed block occurs any time an activity expected to be stopped is allowed to execute, and can be caused by a variety of factors that can make it difficult to diagnose.
Symptoms:
Symptoms generally present in one of two ways: either an alert fires containing suspect activity that was expected to be blocked, or no alert is received but the user has other indications of suspicious or unexpected activity.
Diagnostics:
In either case, the best diagnostics available would be the sensor logs. In addition, it is helpful to identify the suspected time of the execution, as well as the name or hash of any files or processes involved in the execution. If the alert is available, the alert ID is also necessary to the investigation.
How to Troubleshoot:
This will largely depend on whether an alert was spawned:
- If an alert was spawned:
- Review the alert to identify the device and the policy in which the activity occurred.
- Check the device policy for rules that would allow the activity.
- Path-based policy rules always supersede reputation-based policy rules.
- Allow rules always supersede deny rules.
- If an alert was not generated:
- Start by searching the investigate page for the unexpected activity to see if it was recorded without triggering an alert.
- If it is in the investigate page, double check policy for the device to determine if there was a rule that allowed the activity, or if a rule should have denied the activity.
- If the activity was not reported to the console, pull sensor logs and open a ticket with support
Local Signatures
The CBC sensor has two sources of reputation data: the CBC console and the local reputation store. In some cases, endpoints may be unable to download reputation data, leaving the local scan version out of date.
Symptoms:
Most commonly, this issue will present as an error indicator appearing in the signature column of the Endpoints page in the console.
Diagnostics:
The only diagnostics usually required are the sensor logs.
How to Troubleshoot:
When troubleshooting this issue, the best step is to pull sensor logs and review scanhost.log for errors. Additionally, check for a SensorAlarms.log for general errors that may indicate other security products injecting into the CBC sensor. The best way to identify these errors is to search the SensorAlarms.log file for FileTamperAttempt. Additionally, verify that the firewall is set correctly.
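The search of SensorAlarms.log for FileTamperAttempt can be sketched in Python as follows. The function name is hypothetical, and reading the file as UTF-8 with replacement of undecodable bytes is an assumption about the log's encoding.

```python
from pathlib import Path
from typing import List, Tuple


def find_marker_lines(log_path: Path, marker: str = "FileTamperAttempt") -> List[Tuple[int, str]]:
    """Return (line_number, text) for each log line that contains marker."""
    hits = []
    # errors="replace" keeps the scan going if the file has unexpected bytes.
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if marker in line:
                hits.append((number, line.rstrip("\r\n")))
    return hits


# Hypothetical usage; the path depends on where the sensor logs were extracted:
# for number, text in find_marker_lines(Path("SensorAlarms.log")):
#     print(number, text)
```

The line numbers make it easier to reference specific entries when the logs are later provided to support.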
If the error cannot be identified, open a ticket with support and provide the sensor logs.
Sensor Upgrade Issues:
Upgrade issues can have a variety of causes, but most often upgrade failures are as a result of an incorrectly configured deployment via a deployment tool.
Symptoms:
An upgrade issue commonly presents as a sensor that is unable to upgrade: it is either stuck in an upgrade pending status in the console, or its sensor version never reflects the completed upgrade. Additionally, the sensor may become partially installed, making log collection difficult. Alternatively, the upgrade may complete but the sensor may be stuck in bypass.
Diagnostics:
The best diagnostics for troubleshooting are sensor logs. If sensor logs are not available, logs may still exist in the program data directories.
How to Troubleshoot:
It’s important to understand the history of the sensor to best address the issue, as with an install / uninstall issue.
- How was the sensor installed?
- Check for deployment methods, specifically any deployment tool that may have owned the upgrade process. Common offenders are GPO and SCCM, but other tools may have similar issues.
- For SCCM, ensure deployment was configured according to best practices
- For GPO misconfiguration, often the GPO is set to check for a specific version and will downgrade / roll back the sensor to an earlier specific version.
- Check for other security software that may be tampering with / blocking the sensor upgrade.
- Try a reboot. In most cases, a reboot is not required to complete the upgrade, however in some cases wherein other application updates have the registry locked, the CBC sensor may need the reboot to unlock the registry and finalize the sensor upgrade.
Additional Reading:
Please review the following guides for detailed information about some common upgrade errors:
CB Sensor Upgrade Doesn't complete until reboot
Sensor upgrades failing to install after initial GPO/SCCM deployment
Console Issues:
Console issues are most often either transient or caused by an issue with the CBC cloud.
Symptoms:
This can vary, but a few common issues are below:
- Performance issues on pages
- Inability to log in
- Events or alerts missing
- Buttons / links broken or not working
- Pages failing to load
- Alerts failing to dismiss
Diagnostics:
A HAR file is the only log that needs to be collected for console-related issues.
How to Troubleshoot:
Troubleshooting console issues begins with checking the CBC Status page. If a known outage is not at fault, collect a HAR file while reproducing the issue and open a ticket with support. You can also share the file securely with Red Canary: Sharing files securely with Red Canary.
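Before submitting a HAR file, it can be useful to confirm that it actually captured the failing requests. The following Python sketch lists entries with an error status, relying on the standard HAR layout (log, entries, request, response). Treating a status of 0 as a failure is an assumption, since browsers often record blocked or aborted requests that way. The file name in the usage comment is a placeholder.

```python
import json
from typing import Any, Dict, List, Tuple


def failed_requests(har: Dict[str, Any]) -> List[Tuple[int, str, str]]:
    """Return (status, method, url) for entries with status 0 or >= 400."""
    failures = []
    for entry in har.get("log", {}).get("entries", []):
        status = entry.get("response", {}).get("status", 0)
        if status == 0 or status >= 400:
            request = entry.get("request", {})
            failures.append((status, request.get("method", ""), request.get("url", "")))
    return failures


# Hypothetical usage; "console.har" is a placeholder file name:
# with open("console.har", encoding="utf-8") as handle:
#     for status, method, url in failed_requests(json.load(handle)):
#         print(status, method, url)
```

If the list is empty, the issue may not have been reproduced while the capture was running, and the HAR file may need to be collected again.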
Sensor Stuck in Bypass
A sensor can be stuck in a bypass state wherein the sensor will not come out of bypass regardless of disabling the bypass via the console.
Symptoms:
A sensor is showing in bypass on the endpoints page in the console and the bypass cannot be disabled.
Diagnostics:
The best diagnostic for the issue is sensor logs.
How to Troubleshoot:
Troubleshooting devices stuck in bypass is best accomplished as follows:
- Verify that the sensor is compatible with the OS
- This is the most common cause on Linux installations
- Verify that the Linux headers have been installed correctly
- Ensure that there is no other security product installed on the endpoint. This can break the
- For a Windows sensor, pull sensor logs and attempt an uninstall / reinstall of the sensor. If the uninstall / reinstall fails, please see the Installation Issues section of this guide.
- For MacOS:
- Check the version of MacOS being run
- Verify the sensor has full disk access
- Verify the sensor has System Extension Approval via MDM or locally, or ensure KEXT Approval for MacOS 10.13 or earlier.
An endpoint stuck in bypass has encountered a catastrophic error. Sensor logs can often identify the cause, but in most cases an uninstall / reinstall is required.