Field Support Toolbox - Debug procedures


DESCRIPTION

Field Support Toolbox - Debug procedures. Nick Hurd, Technical Director, CMSgateways.com. CONNECT / DIRECT is a vital component necessary for Electronic Health Information Exchange. Documented success of CONNECT/DIRECT systems: many installations, fulfills various requirements.

  • Field Support Toolbox

    - Debug procedures

    Nick Hurd, Technical Director

    CMSgateways.com

    CONNECT / DIRECT Field Support Overview
    - CONNECT / DIRECT is a vital component necessary for Electronic Health Information Exchange
    - Documented success of CONNECT/DIRECT systems
      - Many installations
      - Fulfills various requirements
    - Requirements vary depending on participants
      - Example: DoD (HW security) vs. other participants (SW security)
    - Continuous operation will require field service support
      - Requires communications between different vendors, modules & versions
      - Many interdependent stages (hops)
      - Troubleshooting dependencies, updates, inter-operability
      - System problem resolution can require hours/days/weeks
    - Reliable operations will require efficient field support
      - Processes, tools, personnel, training, documentation
    - Field service tools expedite CONNECT / DIRECT acceptance

    CONNECT/DIRECT Case study: CMS Electronic Report workflow
    [Diagram: Health Care Provider, CMS, Quality Report (PHI), Feedback]

    CMS electronic report requirements
    - CMS reporting requirements: Validity, Integrity, Precision, Reliability, Timeliness, Access, Security
    [Diagram: Health Care Provider, CMS, Quality Report (PHI), Feedback]

    Unique modules from different vendors implement and verify each requirement
    [Diagram: Health Care Provider, Data Source, CONNECT / DIRECT, CMS, Quality Report (PHI), Feedback; requirement modules: Integrity, Validity, Precision, Reliability, Timeliness, Access, Security]

    Data logjam - One problem can stop workflow
    [Diagram: Health Care Provider, Data Source, CONNECT / DIRECT, CMS with requirement modules; callout: "Where's my report?"]

    CONNECT/DIRECT Field Support Overview
    - Current Problem Determination (PD) process characteristics
      - Labor intensive diagnosis
      - Manually assemble, correlate, and interpret logs
      - Repetitive, time consuming problem resolution tasks
      - Advanced skills and extensive debug time (hours/days) required
    - System design has impact on PD
      - Are PD diagnostics integrated into code paths?
      - CONNECT 4.x has begun integration of PD logs & metrics!
    - Poor problem determination processes & lack of PD tools lead to
      - Increased cost of ownership
      - Decreased utilization
      - Decreased market share
      - Disconnected & mothballed technology

    CONNECT/DIRECT Field Support Overview
    - Field Support Goal: Improve maintainability
      - Automated diagnostic tools
      - Reduced downtime
      - Streamlined diagnostic processes
      - Reduce cost of support
    - Components of maintenance:
      - Reliability
        - Optimize MTBF (Mean Time Between Failure)
      - Availability
        - Total time a system is expected to function
        - Mean Time Before Repair (MTBR)
      - Serviceability
        - Ease of maintenance & repair
        - Minimize MTTR (Mean Time To Recovery/Repair)
    - RAS: Reliability, Availability, Serviceability
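
The slide names MTBF and MTTR but gives no formula connecting them. The following is a minimal sketch, assuming the common steady-state definition availability = MTBF / (MTBF + MTTR); the function names and the sample figures are hypothetical and are not taken from any CONNECT installation.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def yearly_downtime_hours(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected downtime over an 8760-hour year at the computed availability."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * 8760.0


# Hypothetical gateway that fails every 500 hours on average.
# Same reliability, different serviceability: 8-hour vs. 1-hour repairs.
slow_repair = availability(500.0, 8.0)
fast_repair = availability(500.0, 1.0)
```

With reliability held constant, shortening MTTR is the only lever in this model that raises availability, which is the case the deck makes for field support tooling.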


    Different modules implement and verify each requirement
    [Diagram: Health Care Provider, Data Source, CONNECT, CMS, Quality Report (PHI), Feedback; requirement modules: Integrity, Validity, Precision, Reliability, Timeliness, Access, Security]

    Problem scenario #1: Data logjam - One problem stops workflow
    [Diagram: Health Care Provider, Data Source, CONNECT, CMS with requirement modules; callout: "Where's my report?"]

    Current Debug process - step #1: Manual review of all logs
    [Diagram: Health Care Provider, Data Source, CONNECT, CMS with requirement modules; logs LOG1, LOG2, LOG3 ... LOGn]

    Current Debug process - step #2: Detailed review of log of offending module
    [Diagram: LOG2 reports "No valid Access list"; Certification List corrupted]

    Problem scenario #2: Interactive problems -> Increased MTTR
    [Diagram: LOG2 reports "No valid Access list"; LOG7 reports "No access list"; labels include Source verification, Datacomm, Security]

    Problem Scenario #3: Entire system deadlocked
    [Diagram: Health Care Provider, Data Source, CONNECT, CMS with requirement modules; NO Feedback]

    Current Debug process - step #1: Manual review of all logs => unusable
    [Diagram: logs LOG1, LOG2, LOG3 ... LOGn]

    Diagnosis: EXPIRED log account -> Halted log file creation
    [Diagram: EXPIRED LOG ACCOUNT; logs LOG1, LOG2, LOG3 ... LOGn]
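
In scenario #3 the logs were unusable because log creation had silently stopped. One way to surface that condition early is to check how recently each log was written. The sketch below is an illustration only: the function name, the 15-minute threshold, and the sample snapshot are hypothetical.

```python
from datetime import datetime, timedelta


def stale_logs(last_write, now, max_silence=timedelta(minutes=15)):
    """Return names of logs whose newest entry is older than max_silence.

    last_write maps a log name to the datetime of its newest entry,
    or to None when the log file was never created.
    """
    stale = []
    for name, written in sorted(last_write.items()):
        if written is None or now - written > max_silence:
            stale.append(name)
    return stale


# Hypothetical snapshot: LOG3 stopped growing when its account expired.
now = datetime(2014, 6, 1, 12, 0)
snapshot = {
    "LOG1": datetime(2014, 6, 1, 11, 58),
    "LOG2": datetime(2014, 6, 1, 11, 55),
    "LOG3": datetime(2014, 5, 30, 23, 59),
}
print(stale_logs(snapshot, now))  # prints ['LOG3']
```

A check of this kind treats "no log output" as a diagnostic signal in its own right, instead of leaving it to be discovered during manual log review.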

    CONNECT/DIRECT Field Support Overview
    - Problem Determination (PD) components
      - Problem management discipline
        - Automate maintenance functions
        - Identify RAS tools requirements (Reliability, Availability, Serviceability)
      - PD workflow procedures
        - PD query process
        - PD environments
      - RAS tool solutions
        - Open source vs. proprietary
        - Diagnostic information from variety of sources

    CONNECT/DIRECT Field Support Objective
    - Problem Management Discipline
      - Problem documentation: Confirm, categorize, prioritize & publish
      - Acquire relevant Problem Determination (PD) data
        - Automate common PD support tasks
        - Involve all participants: Users, field support staff, 3rd parties
        - Example: Xref problem lists from other bugs & third party modules
      - Apply tools => observe & control system
        - Expedite the identification of fault source(s)
      - PD data analysis (Dev team, test team or Field support)
        - Transform intermittent bug => regular bug
        - Resolve the mystery cause(s)
      - Implement bug fix (w/ no side effects)

    CONNECT/DIRECT Field Support Workflow
    - Diagnostic workflow procedures
      - Goal: Acquire relevant diagnostic data
    - Understand operations
      - Cartography - Functional map of complete system
        - Internals: Modules & data flow
        - Externals: Protocols & states of transaction
      - Configuration, version control
        - Standardized update procedures
        - Module interdependencies
    - Tools and diagnostic data acquisition processes
      - Extend development & test bench into field
      - Enable users & field personnel to collect USEFUL diagnostics

    CONNECT/DIRECT Field Support Tools
    - Problem Determination (PD) automation tools
      - Automated data collection
        - Configuration, input/output, status, version
        - Heterogeneous environment: modules & subsystems
        - Diagnostic APIs: Logs, traces, events, signals, exceptions
      - Forensic data mining
        - Log merge, parsing, sorting & analysis
        - Identify events leading up to problem
        - Isolate source(s) of problems
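
The "log merge, parsing, sorting & analysis" step can be illustrated with a short sketch. It assumes every log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp and that each module's log is already in time order; that line layout, the function names, and the sample entries are assumptions made for illustration and are not a CONNECT log format.

```python
import heapq
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout, not a CONNECT format


def parse_line(source, line):
    """Split 'YYYY-MM-DD HH:MM:SS message' into (timestamp, source, message)."""
    stamp, message = line[:19], line[20:]
    return datetime.strptime(stamp, TIMESTAMP_FORMAT), source, message


def merge_logs(logs):
    """Merge per-module logs (each already in time order) into one timeline."""
    streams = [
        [parse_line(source, line) for line in lines]
        for source, lines in logs.items()
    ]
    return list(heapq.merge(*streams))


def events_before(timeline, failure_time, window_seconds):
    """Return entries in the window leading up to (and including) failure_time."""
    return [
        entry for entry in timeline
        if 0 <= (failure_time - entry[0]).total_seconds() <= window_seconds
    ]


# Hypothetical logs from two modules.
logs = {
    "security": ["2014-06-01 10:00:05 certificate list loaded",
                 "2014-06-01 10:02:10 No valid access list"],
    "datacomm": ["2014-06-01 10:01:00 connection opened",
                 "2014-06-01 10:02:11 transfer rejected"],
}
timeline = merge_logs(logs)
suspects = events_before(timeline, datetime(2014, 6, 1, 10, 2, 11), 60)
```

In this sample, `suspects` holds the security entry one second before the rejected transfer, which is the kind of cross-module correlation that is slow to do by hand.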


    CONNECT/DIRECT Problem Determination (PD) Components
    [Diagram: SYSTEM 1, SYSTEM 2, SYSTEM 3; layers: OS, Drivers/DLL, JVM, App Svr, CONNECT, App, Thread(s), DBMS; diagnostic info sources: Signals, Logs, Traces, Exceptions, Assert; APIs with Filters, Formatters, Modifiers; output options: Net Socket, Mem Buff, OutputStream, Console, File; View & Analysis Tools]

    PD considerations: CMS Quality Report workflow paths
    [Diagram: Providers (IE_EHR from 200+ vendors, PM, legacy and cloud DBMS, HIE, CONNECT & other subsystems); CCD / PQRS; HIT matrix (HIE, HISP, HIH, XCA, XCPD, PHR, Security/Access); CMS (Vetting, XML Parser, File Management, DBMS)]

    Problem Determination (PD) Queries
    - Problem Determination workflow procedures
      - PD queries
        - Accurate problem report?
        - Different system?
        - Different state?
        - Different data?
      - Complete problem report via PD queries
        - User interview
        - Diagnostic data acquisition
        - PD procedures

    Problem Determination (PD) Query #1
    - Is this problem report / observation accurate?
      - Corrupted problem record
        - Incomplete, unreliable communications
      - Misattribution / false correlation
        - Intermittent problem misconstrued => non-intermittent problem w/ complex and unlikely set of causes (MS Word => Win crash)
      - Misrepresentation
        - Incomplete assessment (PS3 malfunction: hidden connector was unplugged)
        - Different operators have different problem tolerances and sensitivities
        - Sensitivity can vary with time of day
      - Irrelevant problem (i.e. observation is "too accurate")

    Problem Determination (PD) Query #1
    - PD information categories - problem reports
      - Timestamp, PD environment, priority, classification, scope of problem
      - Log augmentation: Track multiple entries by multiple authors
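
The information categories above map naturally onto a small record type. The following is a sketch only: the class names, field names, and sample values are hypothetical, chosen to mirror the categories on the slide and not taken from any existing CONNECT tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class ReportEntry:
    """One addition to a problem report; author and time are always recorded."""
    author: str
    timestamp: datetime
    text: str


@dataclass
class ProblemReport:
    """Fields follow the categories listed on the slide; values are free-form."""
    timestamp: datetime
    pd_environment: str  # e.g. "Field Install"
    priority: str
    classification: str
    scope: str
    entries: List[ReportEntry] = field(default_factory=list)

    def add_entry(self, author, timestamp, text):
        """Append rather than overwrite, so earlier observations stay visible."""
        self.entries.append(ReportEntry(author, timestamp, text))

    def authors(self):
        """Distinct authors in order of first contribution."""
        seen = []
        for entry in self.entries:
            if entry.author not in seen:
                seen.append(entry.author)
        return seen


# Hypothetical report augmented by two different authors.
report = ProblemReport(
    timestamp=datetime(2014, 6, 1, 9, 30),
    pd_environment="Field Install",
    priority="high",
    classification="missing feedback report",
    scope="single provider",
)
report.add_entry("user", datetime(2014, 6, 1, 9, 30), "Report submitted, no feedback received")
report.add_entry("field support", datetime(2014, 6, 1, 10, 15), "LOG2 shows: No valid access list")
```

Keeping every entry with its author and timestamp is one way to implement the "log augmentation" bullet: later readers can see who observed what, and when.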

    Problem Determination (PD) Query #2
    - Is it a different system?
      - Automatic or IT updates
      - Trespassing system - foreign intrusions
      - Configuration changes
      - Third party add-ons affect code paths
        - Drivers, driver stacks, DLLs, apps, monitors
    - Documentation & processes in place
      - Automated version comparison / control programs
      - Rollbacks & version control co-ordination
      - Third parties
      - Documented version inter-dependencies

    Problem Determination (PD) Query #3
    - Is system in a different state?
      - System in different mode?
        - User or protocol may have set different mode
      - Improper init
        - Changes in config, registries, resources & routing tables
      - Resource denial
        - File, stream, or other resource
        - Corrupted, does not exist, locked by another process/thread
      - Occasional functions
        - Auto-save, periodic maintenance, internal garbage collection
      - Progressive data corruption (timing loops, rounding)
      - Progressive destabilization
        - Destabilizing event: create wild pointer
        - Initiating event: use wild pointer

    Problem Determination (PD) Query #4
    - Did system receive different data?
      - Secret / different boundaries and conditions
        - Software may act differently in different parts of input space
        - Different logic invoked by chosen option(s)
      - Input corruption
        - Input corrupted or intercepted
      - Deus ex machina - Third party influence
        - Fellow developer/tester, other user, hacker
      - Accidental or "ghost" input
        - Signals from different peripherals, networks
        - Sun => optical mouse
        - RTF from MS Word & MS WordPad are not the same
      - Consider time & loading as an input

    Problem Determination (PD) Processes
    - PD Environments
      - Development, System Test, Multi-System Test, Field Install
    - PD Tools
      - Scope of diagnostic data
        - Systemwide, Server, Application, Module
        - Component interactions
      - Tool providers: Open Source & Proprietary
    - Setup communications between all of the above!

    Problem Determination (PD) environment #1: Software Development
    - Software Development environment
      - Interactive debugging - IDE / Eclipse (or ?)
        - Call stack, variable values, breakpoints
      - Printf debugging / TRON
      - ASSERT
      - Post-mortem debug: crash analysis
      - Semantic errors - Static code analysis tools

    Problem Determination (PD) environment #2: System test suite
    - System test suite environment
      - Purpose: Decrease costs of functional defects
        - Each development stage has associated defect resolution costs
        - Requirements, Arch, Construction, System test, Post release
        - Defect costs more if caught at later stage
        - Field Support => multiple updates => configuration changes
        - Cloud/Continuous deployment reduce costs of later stages
      - Test input combinations and preconditions
        - Automated finite combinational tests
        - Get greater test coverage with fewer tests
        - Compromise test speed vs. test depth
      - Need coverage of non-functional attributes
        - Usability, scalability, performance, compatibility (version), reliability

    PD environment #3: Inter-system bench test
    - Inter-system bench test
      - Controlled environment
        - Version, loading, data mix
        - Multi-vendor, multi-module
        - Multiple overlapping errors increase PD complexity
      - Controlled debugging
        - Dedicated offline systems => remote test bed
      - Problem determination
        - Balance performance with Serviceability (RAS)
        - Automated data collection
        - Test offline analysis procedures - automated & manual

    PD environment #4: Field Install
    - Customer Install - Field Service
      - Uncontrolled environment
        - Version, loading, data mix
        - Multi-vendor, multi-module
        - Multiple overlapping errors increase PD complexity
      - Online, live debugging
        - #1 Goal of Field Support: Keep system online!
        - Can dedicate extra system as remote test bed
      - Problem determination
        - Balance performance with Serviceability (RAS)
        - Automated data collection
        - Offline analysis - automated & manual

    PD debug mode #1 => Source debug
    - Logic debug of an app module
      - Hard faults - ASSERT
        - Usually removed from production code
      - Intermittent problems
        - Stress system to recreate problem
        - If race condition exists, usually affected by debug process
        - Threading, memory management issues
        - Debugger affects timing, can exaggerate or solve problem
      - Fuzz tests w/ random input => irrational border cases

    PD debug mode #2: API debug
    - Problems between system components
      - Heterogeneous environment
        - Must track version history of (related) subsystems
      - Inter-dependencies
        - Scripted automated compare: look for version delta
        - Automated test scripts
      - Version dependencies
        - Example: NwHIN protocols
        - Options
      - Race conditions
        - Test configurations => vary timing
      - System loading
        - Test configurations => vary sources, sinks & data loads

    Inter-System Datacomm PD
    [Diagram: Provider (IE_EHR, PM, DBMS, CONNECT & other subsystems), HIEs, CONNECT, CMS (Vetting SW, Parser SW, File Management, DBMS, Claims); Quality Report and FEEDBACK]

    PD debug mode #3 - Communications
    - Communication protocols between systems
      - PD transaction analysis
        - Between CONNECT and trading partners such as:
          - NIST: Conformance testing against a reference
          - Other vendors: Interoperability (@ IHE Connectathon)
      - CONNECT V4.0 incorporates PD metric & error logs
        - Performance: Transaction type, payload
        - Error messages log
      - XDS.b transaction/datacomm tools & reference [email protected] Test Tools -> http://hit-testing.nist.gov:12080/xdstools2
      - Connectathon: http://www.ihe.net/connectathon/

    PD debug mode #4: Security
    - Security management problems
      - CERT management: a time consuming debug issue!
        - Default certificate configuration
        - Obtaining signer certificate from a remote port
        - Remote signer certificate retrieval
        - Validating a remotely-retrieved signer certificate
        - Replacing certificates and signers
        - Certificate expiration monitor and dynamic run time updates
        - Advanced certificate and key management issues
      - CERT management tools
        - WebSphere GUI admin console
        - Windows command line => certmgr.exe
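
The "certificate expiration monitor" bullet can be sketched as a simple classification over expiry dates. This is an illustration under stated assumptions: the expiry dates are taken as already extracted from the certificate store, and the function name, the 30-day warning window, and the sample aliases are hypothetical.

```python
from datetime import datetime, timedelta


def expiring_certificates(not_after, now, warn_within=timedelta(days=30)):
    """Classify certificates by expiry date.

    not_after maps a certificate alias to its notAfter datetime.
    Returns (expired, expiring_soon) as sorted lists of aliases.
    """
    expired, soon = [], []
    for alias, expiry in not_after.items():
        if expiry <= now:
            expired.append(alias)
        elif expiry - now <= warn_within:
            soon.append(alias)
    return sorted(expired), sorted(soon)


# Hypothetical certificate store contents.
now = datetime(2014, 6, 1)
store = {
    "gateway": datetime(2015, 1, 1),
    "partner-signer": datetime(2014, 6, 20),
    "old-internal": datetime(2014, 5, 1),
}
expired, soon = expiring_certificates(store, now)
```

Here `expired` contains `old-internal` and `soon` contains `partner-signer`; running such a check on a schedule turns an expired certificate from an outage into a routine maintenance task.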

    PD debug mode #5: Intermittent bug
    - Field multi-system intermittent problems
      - Field Support procedures & tools requirements
        - Support multi-vendor environments
        - Version dependencies of multiple modules
        - Disparate data sources
      - Automated data collection
        - Minimize expertise required for data acquisition
        - Automate module / code path analysis
        - Offline analysis merges diag data from different sources
      - Minimize and localize performance tradeoffs
        - Serviceability (RAS) AND
        - System loading, throughput, stability

    PD Doc #1: Automated Version Documentation!
    [Diagram: SYSTEM 1, SYSTEM 2, SYSTEM 3; layers: OS, Drivers/DLL, JVM, App Svr, CONNECT, App, DBMS; version documentation SCRIPT (FC script) gathers VERSIONS into System VERSION, CONNECT VERSION, and Composite VERSION; Version Compare Tool(s) compare against Composite VERSION (Yesterday)]

    PD Doc #1 - System config docs
    - System DOCUMENTATION
      - Timely automated gathering of CONFIG
        - Modules / subsystems / OS
        - ALL VENDORS!
        - Date, time, checksums
      - Automated, scripted comparison
        - Establish version / change history
        - Immediately spot any deltas
        - Helps to map out updates, rollbacks, hotfixes, etc.
      - Some people rely on dump/trace/log for same info
        - Deltas are not easy to extract and compare
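
The automated, scripted comparison described above can be sketched as a diff between two version manifests. The manifest shape (module name mapped to a version string and a checksum), the function names, and all sample module names and version numbers below are assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest used as the fingerprint of one module's file contents."""
    return hashlib.sha256(data).hexdigest()


def manifest_delta(yesterday, today):
    """Compare two {module: (version, checksum)} manifests and report deltas."""
    added = sorted(set(today) - set(yesterday))
    removed = sorted(set(yesterday) - set(today))
    changed = sorted(
        name for name in set(today) & set(yesterday)
        if today[name] != yesterday[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical composite versions captured on two consecutive days.
yesterday = {
    "CONNECT": ("4.0", checksum(b"connect-build-a")),
    "JVM": ("1.7.0_45", checksum(b"jvm-build-a")),
    "DBMS driver": ("5.1", checksum(b"driver-build-a")),
}
today = {
    "CONNECT": ("4.0", checksum(b"connect-build-a")),
    "JVM": ("1.7.0_51", checksum(b"jvm-build-b")),
    "Log agent": ("1.0", checksum(b"agent-build-a")),
}
delta = manifest_delta(yesterday, today)
```

Comparing checksums as well as version strings catches a module whose contents changed without its reported version changing, which is one reason the slide lists checksums alongside date and time.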


    PD Doc #2 - Application Logs
    - Instrument your code!
      - Log statements
      - Log data categories
      - Performance counters (system loading)
      - Stack traces
      - Race conditions (timeout counters)

    PD Doc #2A: App Log via JVM
    [Diagram: layers: OS, Drivers/DLL, JVM; info sources: Signals, Logs, Traces, Exceptions, Assert; CONNECT LOG CODE, DBMS; APIs; LOG FILE; View & Analysis Tools: JAVA Console, JAVA Admin]

    Java JVM Log
    - Logging: redirect Java Console output to log file via Java Logging API
    - To enable logging, perform the following actions:
      - Open Java Control Panel / Admin panel
      - Click Advanced tab
      - Select Enable Logging under the Debugging option

    Java Log options
    - Options:
      - Redirect System.out & System.err
        - To log file
        - To network socket
        - To OutputStream
        - To mem buffer
      - Rotating log files
      - Formatters
        - XML or Text
    - Levels:
      - Severe, warning, info, config, fine, finer, finest

    App Log control (>JDK 1.4)
    [Diagram: layers: OS, Drivers/DLL, JVM; info sources: Signals, Logs, Traces, Exceptions, Assert; CONNECT LOG CTL CODE, DBMS; APIs with Filters, Formatters (XML, Text), Modifiers (Fine, finest); outputs: LOG FILE, Net Socket, Mem Buff, OutputStream; View & Analysis Tools: JAVA Console, JAVA Admin]

    JAVA Logging Framework: Native JVM log components - functions
    [Diagram: outputs: SOCKET, CONSOLE, BUFFER, FILE; formats: XML, Txt; filter to exclude messages with a particular key; configuration per class]

    More options: Open Source log4J
    - Sun Java Log API
      - Universal
      - No external dependencies
      - Generally included in proprietary
    - log4J Log API
      - IBM ported RAS code => Java => Open Source
      - More output options
      - Flexible config
      - Longer history, smaller footprint, faster, thread safe

    log4J: More output options
    [Diagram: outputs: Unix Syslog, Email, NT event log, SOCKET, CONSOLE, BUFFER, FILE; layouts: XML, Txt, HTML, TTCC; formatter layout: thread id, class, etc.; filter to exclude messages with a particular key; configuration per class / per thread]

    Other log4J Log improvements
    - Improved performance
      - Asynchronous loggers
      - 10x throughput and orders of magnitude lower latency
    - Support for multiple APIs
      - SLF4J: Simple Logging Façade
        - USER plugs in log framework at deployment time
      - Commons Logging
        - Change logging implementation without recompilation
    - Automatic reloading of configurations
      - Without losing log events while reconfiguration is taking place

    (PD) Mechanisms: JVM Trace
    [Diagram: layers: OS, JVM, App Svr, CONNECT, App, DBMS; info sources: Signals, Logs, Traces, Exceptions, Assert; APIs; circular trace buffer in memory, File; View & Analysis Tools: JAVA Console, JAVA Control Panel; Java.plugin.trace.option: basic, cache, net, security, ext, liveconnect, all]

    Java Trace
    - Set initial trace level for Java Web Start application
    - Change trace level with API, trigger events
      - JVMRI (IBM - RAS Interface, deprecated)
      - JVMPI (Sun Profiling interface, deprecated)
      - JVMTI (JVM / Oracle / IBM Tools interface, current)
    - Set the deployment property deployment.trace.level
      - Basic, cache, net, security, ext, liveconnect, all

    Problem Determination Solutions
    - Open source PD
      - Example: log4J
      - Advantages:
        - Source available for debugging/extensions
        - Small scale projects
        - Can be customized to emulate proprietary functionality
    - Proprietary PD
      - System examples: WebSphere, WebLogic
      - Advantages:
        - Subsystem integration & testing version control
        - PD tools => problem determinations cover more system components

    WebLogic Log Diagram
    [Diagram]

    IBM WebSphere LOG extensions
    - IBM extensions of log4J
      - Logging domains
      - Nested Diagnostic Contexts (NDC)
      - Mapped Diagnostic Contexts (MDC)

    Advantages - Proprietary Solutions
    - IBM WebSphere
      - JVM log + log4J + proprietary extensions
      - Integrate mainframe experience
        - Streamlined binary log/trace: 3x faster
        - Multi-server log merge
        - Advanced filtering and admin consoles
      - Merged open source with proprietary extensions

    Expand scope of debug info to App
    [Diagram: Health Care Provider, CMS, Quality Report (PQRS), PHI - XML, Feedback]

    Expand scope of debug info to App w/ many vendors & transactions
    [Diagram: Provider (IE_EHR from 200+ vendors, PM, legacy and cloud DBMS, CONNECT & other subsystems), HIEs, CONNECT, CMS (Vetting SW, Parser SW, File Management, DBMS); Quality Report and FEEDBACK]

    Users want system totally functional. Debug tools => systemwide solutions!
    - Users want problem resolved ASAP
    - Users care about MTTR (Mean Time To Recovery/Repair)
    - System deltas need to be bridged
      - Transactions
      - Remote procs
      - Error handling
    [Diagram: Provider, CONNECT & other subsystems, CMS; Pre-Submission, Submission, Vetting, FEEDBACK (Incentives, Disincentives); Quality Report: Participants, Roles, Date/Time, Locations, Vitals, Lab Reports]

    Tools must be able to identify the many sources of system fault(s)
    [Diagram: Provider (IE_EHR from 200+ vendors, PM, legacy and cloud DBMS, CONNECT & other subsystems), HIEs, Routers, CONNECT, CMS (Vetting SW, Parser SW, File Management, DBMS); callouts: "Where's my feedback?" and "Where's my report???"]

    Each subsystem has diagnostic logs
    - Multiple logs
      - System wide vs. app specific
      - Defined interfaces
      - Improve code maintenance
    - Scope of diagnostics
      - System wide vs. app specific
      - Defined vs. custom interfaces
    - Tradeoffs
      - Interaction
        - Other system components
        - Other apps
        - External systems
      - Impact on system performance
      - Communications ability
    [Diagram layers: JVM, App Server, OS]

    Need for composite logs
    - Multiple log functions
      - Sync and parse
      - System wide & app specific
      - Defined interfaces
      - Improve SYSTEM maintenance
    - Scope of diagnostics
      - System wide
      - All interfaces
    [Diagram layers: JVM, App Server, OS]

    System Support - delegation
    - Handoff of system support
      - From programmers to field support
      - Planned transition
      - Enable programmers to be more efficient
    - CONNECT improvement
      - RAS: Reliability, Availability, Serviceability
      - (Semi) automatic problem resolution
      - System modularity

    CMS report pathways (2014/2015)
    [Diagram: Health Care Provider (IE_EHR, PM, DBMS, Report Generator, Logs / Audit), DIRECT, CONNECT, CONNECT / DIRECT, CMS (Vetting, Parser, File Management, DBMS, Source Control); Quality Report (PQRS), PHI - XML, FEEDBACK; protocols and standards: SMTP, XDS.b, X.509, S/MIME, ODBC]

    CMS report components
    [Diagram: Health Care Provider, CMS, Feedback; Quality Report (PQRS) with PHI from patient medical record: Section a. (PM): Org / Provider / Dates, ICD / CPT / DRG; Section b. (_EHR): Vitals & Labs Results, SYS BP = xxx; CONNECT: Registry, Logs / Audit, Repository, Core Services, Gateway, MPI, Client Interface; CMS: Vetting, Parser, File Management, DBMS, Source Control]

    Review: CONNECT Field Support
    - Coordinated Problem Determination (PD)
      - Goal: Improve RAS
        - Increase Reliability, Availability, Serviceability
      - Milestones to goal
        - Problem management discipline
        - Problem determination workflow procedures
        - RAS tool solutions
          - Open source & proprietary
          - Vendor choice(s) affects procedures, staffing & MTTR
          - MTTR (Mean Time To Recovery/Repair)

    Review: CONNECT/DIRECT PD processes
    - Standardized Field Support RAS procedures
      - Enable field support and non-programmers to extend support
        - Collect USEFUL diagnostic info
        - Start initial diagnostic process
        - Interact with advanced diagnostics
    - Diagnostic document workflow and debug procedures
      - Cartography - Functional map of complete system
      - Understand diagnostic data flow - modules & protocols
    - Problem Determination (PD) automation tools
      - Automated data collection
        - Diagnostic APIs: Logs, traces, events, signals, exceptions
      - Forensic data mining => log parsing, sorting & analysis
        - Identify events leading up to problem, isolate source(s) of problems

    Problem Determination (PD) Tools
    [Diagram: SYSTEM 1, SYSTEM 2, SYSTEM 3; layers: OS, Drivers/DLL, JVM, App Svr, CONNECT, App, Thread(s), DBMS; diagnostic info sources: Signals, Logs, Traces, Exceptions, Assert; APIs with Filters, Formatters, Modifiers; output options: Net Socket, Mem Buff, OutputStream, Console, File; View & Analysis Tools]

    CMS Quality Report workflow with CONNECT/DIRECT
    [Diagram: Provider (IE_EHR from 200+ vendors, PM, legacy and cloud DBMS, HIE, CONNECT & other subsystems); CCD / PQRS; HIEs; CMS (Vetting, XML Parser, File Management, DBMS); Quality Report and FEEDBACK]

    Contact Info

    We are developing a Field Support Toolbox for CONNECT / DIRECT. This toolbox will include a variety of Problem Resolution Tools.

    Please email any requirements or questions to:

    Nick [email protected]

    Thank you for participating!
