Automated BCF Data Extraction For BIM QC Communication
Antonio J. Romero Requejo
Bachelor’s Thesis
Civil and Construction Engineering,
Raasepori 2019
BACHELOR’S THESIS
Author: Antonio J. Romero Requejo
Degree Programme: Civil and Construction Engineering, Raasepori
Specialization: Structural Engineering
Supervisors: Mats Lindholm, Max Levander
Title: Automated BCF Data Extraction for BIM QC Communication
_________________________________________________________________________
Date: 28.10.2019 Number of pages: 53 Appendices: 3
_________________________________________________________________________
Abstract
According to multiple studies, communication in the AEC industry is a large, evident
problem that should be addressed in order to minimize errors and maximize overall
quality. At the same time, the AEC industry is taking a disruptive step by deeply integrating
information technology and automation into its workflows to increase efficiency
and provide better-suited solutions. How industry members adapt and change to integrate
this new way of working will define how the industry develops and who emerges as
leader in the coming decades.
BIM coordination is now an essential part of the modern construction process, both
consuming and generating large amounts of information that result in the digital model
that will be used to physically build and maintain the object. These large amounts of data
result in an “information overload” situation, leading to dense, fragmented data, lack of
accountability and failure to address problems, among other things. The problem is
particularly acute as multiple disciplines join the model.
This thesis addresses this problem by developing the implementation of an automated
issue (topic) tracking and quality control dashboard reporting system that would, if not
necessarily solve it, help to mitigate it by turning the current issue tracking
through single files into a more manageable and integrated way of storing, sharing and
presenting BCF information.
_________________________________________________________________________
Language: English Key words: BIM coordination, Issue (topic) tracking, BCF, Automation, QA, QC
_________________________________________________________________________
DEGREE THESIS
Author: Antonio J. Romero Requejo
Degree programme and place: Bachelor of Engineering, Civil and Construction Engineering, Raseborg.
Specialization: Design and Structural Engineering
Supervisors: Mats Lindholm, Max Levander
Title: Automated BCF Data Extraction for BIM QC Communication
_________________________________________________________________________
Date 28.10.2019 Number of pages 53 Appendices 3
_________________________________________________________________________
Abstract
According to several studies, communication in the AEC industry is a large, evident problem that should be solved in order to minimize errors and maximize overall quality. At the same time, the AEC industry is taking a disruptive step by integrating information technologies and automation into its workflows to a high degree, in order to increase efficiency and offer more suitable solutions. How companies in the industry adapt and change to integrate this new way of working will define how the industry develops and who will lead it in the coming decades. BIM coordination is an essential part of the modern construction process. It both consumes and produces large amounts of information that result in the digital model that will be used to “physically” build and maintain the object. These large amounts of data result in “information overload”, which leads to, among other things, dense fragmented data, unclear accountability and failure to solve problems. The problem is particularly acute when several disciplines join the model. This thesis tries to solve the problem by developing an automated tracking and quality control reporting system that would, if not necessarily solve it, at least help to minimize it, by turning the current tracking of issues through single files into a more manageable and integrated way of storing, sharing and presenting BCF information.
_________________________________________________________________________
Language: English Key words: BIM coordination, Topic tracking, BCF, Automation, QA, QC
_________________________________________________________________________
Acknowledgements
I would like to thank Max Levander and Ramboll for the opportunity to write this thesis for
the BIM and Digi Center, and my colleagues at Ramboll for all their encouragement and
collaboration.
Likewise, I would like to thank Novia UAS, its faculty members and its staff for the
support and encouragement received during all these years... Tack!
To my parents, sisters and extended family
To Jennie, Filip and Julian…for everything.
Abbreviations
• AEC: Architecture, Engineering and Construction
• BIM: Building Information Modelling. Building Information Model.
• IFC, .ifc: Industry Foundation Classes. File format extension for IFC files.
• BCF, .bcf: BIM Collaboration Format. File format extension for BCF files.
• CDE: Common data environment
• dB, DB: Database
• URI: Uniform resource identifier
• SaaS: Software as a service
• PaaS: Platform as a service
• IaaS: Infrastructure as a service
• RFI: Request for Information
• XML: Extensible Markup Language
• JSON: JavaScript Object Notation
• GUID: Globally Unique Identifier
• DAX: Data Analysis Expression language
Table of Contents
1 Introduction
1.1 Background
1.2 The BIM Coordination Case
1.3 Thesis Objectives
1.4 Research Constraints
1.5 Research approach
2 Regarding BIM
2.1 Collaboration in BIM Coordination
2.2 The BIM Coordination process
2.3 Communication in the AEC industry
2.4 The need for a BIM information manager
2.5 BIM coordination data extraction
2.5.1 BIM coordination data flow
3 File Formats and Tools
3.1 The BCF File Format
3.2 The JSON file format
3.3 MS Azure and Cosmos DB
3.3.1 Data separation and security
3.3.2 Creating a service
3.4 FME
3.4.1 Uploading to the server
3.4.2 Downloading from the server
3.4.3 Current situation with Cosmos DB and FME
3.5 PowerBI
3.5.1 Reading Cosmos DB data
3.5.2 Data presentation and analysis
3.5.3 Accessing BCF data directly from PowerBI
4 Technical Solutions
4.1 The OpenBIM initiative
4.2 Parsing and data extraction
4.3 Data wrangling and code
5 Presentation of Results
5.1 Data Mining BCF Files
5.2 Summary of general results
6 Conclusion and Further Steps
7 Bibliography
1 Introduction
1.1 Background
The modern Architecture, Engineering and Construction (AEC) industry can no longer be
understood without the benefits that Building Information Modelling (BIM) has offered
the building sector. Increased profits, reduced errors and omissions, and faster and shorter
workflows (Stephen A. Jones, Harvey M. Bernstein, 2012), among other reported benefits,
have pushed the industry into a new realm of effectiveness and productivity. Nonetheless,
there are plenty of well-studied challenges that BIM and BIM managers face daily.
Andrew Criminale and Sandeep Langar from the University of
Southern Mississippi, in their “Challenges with BIM Implementation: A Review of
Literature” (Langar & Criminale, 2017), identify up to thirty-six individual problems, of
which at least one third can be traced back to, or can cause further down
the line, communication problems and errors. Delays, errors and misunderstandings are
well documented as some of the main factors leading to problems in the construction industry
(Pellinen, 2016).
1.2 The BIM Coordination Case
In the words of Max Levander, head of the BIM and Digi Center at Ramboll Finland, “BIM
coordination is about assessing and cross disciplinary coordinating design using BIM”. This
is what current BIM coordination at Ramboll for the Finnish market is at its core. BIM
coordination, in more general terms, can be understood as the process of constructing
a virtual building before any work is done on site, allowing the team to identify schedule,
cost, design and constructability issues (topics), among others, and it is nowadays an integral part of
any medium to large project. The BIM coordinator’s duties consist, among others, of
reviewing the physical coordination of all design disciplines and systems as a group, and the
coordinator is ultimately responsible for determining the clashes and problems in the building model.
Collaboration in the coordination process is difficult and problematic. In other words,
communication between the parties becomes a key issue (topic) and is thus a problem in
itself.
BIM coordination suffers from an “information overload” problem. This is particularly
acute when multiple design disciplines are aggregated into the Building Information Model,
increasing exponentially the number and versions of issues (topics) in said model and resulting
in fragmentation, lack of accountability and failure to address problems, among other things.
Part of this problem originates in the nature of the BIM Collaboration Format (BCF), the
default file format used for communication in BIM coordination: its file-based
nature, the amount of information it contains and how that information is presented, among
other factors. Other file types such as spreadsheets and Portable Document Format (PDF) files
are also commonly used to share information, but they are not an object of study in this
thesis.
1.3 Thesis Objectives
This thesis tries to solve the above-mentioned problem by developing and implementing
a model for an automated tracking and quality control and assurance reporting system
that would, if not necessarily solve, help to mitigate this problem by turning the current
issue (topic) tracking through single BCF files into a more manageable and integrated way
of storing and sharing issue (topic) information, presenting data in a simplified manner and
storing it in a single centralized source of truth.
This model for the implementation of the automated system would turn current BCF issue
(topic) tracking reports produced with Solibri¹ into a cloud-centric database that would
allow project managers, heads of departments and/or executives to follow project-specific
issue (topic) evolution as well as other aggregated information by means of a dashboard-like
frontend.
¹ Please note that Solibri-generated reports are used as the source for BCF files but, this being an
open format, the approach can be adapted to other software vendors’ solutions.
The implementation of such issue (topic) tracking would result in stronger, more effective
project roles, as keeping track of issues (topics) is a central part of leading the project.
Better control results in better project leadership, which in turn results in better (quality) projects.
The main idea behind this postulate is that an always accessible, cloud-centric
database, acting as a software-vendor-independent central issue (topic) repository,
coupled with a rich, specific “project intelligence” dashboard providing at-a-glance
insight and project information, with no need to access the raw information or for
specialized training, would provide project members with a better overview of the project.
In other words, it would act as a telemetry-like system for BIM coordination, providing
decision makers with quick, overarching information on project evolution².
² It is of interest to note that this strengthens the idea that “… A BIM-based quality assurance process, including checking
and analysis of the BIM file, provides a better overview of the building information at an earlier stage. The mere
visual examination of the BIM file will make it easier to form an overall view of the project, not to mention the more
detailed analyses that can be performed.”, as described in COBIM Series 6: 1.1 Quality Assurance; Client View (Solibri, 2012).
1.4 Research Constraints
Although this research could be applied to a broad range of BIM-related use cases, BIM
coordination as defined in 1.2 is the central topic of interest in this thesis. Authoring
software, file formats and other aspects have been limited in scope to the following:
• Typical (in the broad sense) BIM coordination as done by Ramboll Finland.
• Parsing of BCF v.2.1 files as exported by Solibri. Solibri Model Checker 9.8 is used
as the reference version. Nonetheless, any software capable of exporting BCF v.2.1
files should follow the standard and thus be suitable for study.
• Implementation of the automation has been achieved using available software
tools at Ramboll. The following have been used:
o Feature Manipulation Engine (FME) by Safe Software: A data integration
platform with support for various file formats, flow-based data manipulation
and third-party service integration.
o Microsoft Azure: Cloud computing service for building, testing, deploying
and managing applications and services.
o Microsoft Power BI: Business analytics service capable of providing
interactive visualizations and business intelligence.
o Solibri: BIM quality assurance and control software capable of producing
rule-based issue (topic) reports of Building Information Models.
o Python: High-level, interpreted, general purpose programming language.
• Data security and/or protection in no way forms part of this thesis. Further
exploration for mission-critical projects is recommended.
1.5 Research approach
This thesis started by studying BIM coordination work methods: needs, normal
procedures and workflows. This was followed by a study of the BCF file format, its
contents and structure. Relevant information was extracted by parsing the
file using the above-mentioned tools. One main point of interest for the author was to
produce a workflow that would allow for an easy, comfortable solution from a BIM
Manager’s perspective. The solution presented tries to be as simple as possible to interact
with, eliminating hurdles in the current workflow rather than exchanging them for others.
Literature regarding communication problems in the building industry, their origins and
possible solutions was studied as an integral part of the problem.
The following steps were taken:
1. Identification of typical BIM coordination workflow when dealing with issues
(topics) and problem communication.
2. Communication (in the broad sense), digital communication and digital
collaboration in the AEC industry was studied.
3. Identification of best method to parse BCF file format contained information.
4. Identification of best solution for centralized cloud server repository and necessary
requirements.
5. Development of workflow and automatization scripts to achieve said workflow.
2 Regarding BIM
2.1 Collaboration in BIM Coordination
In a recent study published in the International Journal of Project Management, “eight
concepts influencing the development of BIM collaboration” (Liu, et al., 2017) were
identified as key issues (topics) highlighting “the importance of collaboration within
project teams in BIM project delivery”. These were, as listed by the authors, (1) IT capacity,
(2) technology management, (3) attitude and behaviour, (4) role-taking, (5) trust, (6)
communication, (7) leadership and (8) learning and experience. Of these findings, (2), (4),
(6) and (7) are directly related to the underlying premise of this thesis: that the assumption
that using BIM automatically grants the benefits of BIM is wrong, and that what matters is
“how” you use BIM, how you communicate those pieces of information and how you
monitor, control and deal with issues (topics) like data loss, communication problems
and sub-par efficiency. If BIM coordination efforts are to succeed, they require that the
technology aspect of BIM does not hinder the communication part and, in general, that in
the People-Process-Technology triangle no single part is greater than the whole.
Figure 1 Prof. Arto Kiviniemi’s take on People-Process-Technology
BIM coordination must be more than issue (topic) solving; it must act as a
communication tool, one that might, and does, suffer from intrinsic problems. BIM
coordination should not live in isolation, restricted to the default available tools, where
information is not easily shared (or only shared with those directly dealing with it, i.e. the BIM
coordinator and the various design discipline leads) or understood. BIM coordination
information can provide additional insights and solutions to the project and to the
business in general if an integrated information management process to deal with its
inherent problems is available. This last point is of great importance, as valuable metadata
regarding the companies, clients, specialists and design disciplines involved in the project
is siloed in the file and abandoned once the project is complete.
2.2 The BIM Coordination process
As mentioned before, the BIM coordinator reviews and conducts the symphony that all
design disciplines involved in the construction project play. It is worth noting that this
responsibility lies in the hands of the principal designer, who is legally responsible for
coordinating design efforts (Maankäyttö- ja rakennuslaki 5.2.1999/132, 1999); in Finland this
would be, in most cases, the architect. This BIM coordination effort is sometimes then
offloaded to a specialist subcontractor like Ramboll. In a process known as clash
detection, if any error appears, understood as any possible clash, conflict or problem between
design disciplines, a report commonly referred to as a “topic” is produced with
tracking software, Solibri for example. This issue (topic) is then classified according to its
severity and the design disciplines involved, and elevated to “in progress” status. The issue
(topic) is then assigned to those design disciplines that must take part in solving the
problem. Design discipline leads, together with the design discipline team members, will
work on providing a solution according to context, experience, cost and design. This
solution is forwarded as an updated design discipline model, and the issue’s status is moved to
“resolved” once the BIM coordinator approves the solution provided by the design
discipline leads, provided the solution creates no secondary problems. In some cases, issues
(topics) might be deemed no longer relevant or otherwise no longer a problem, in which
case their status is moved to “closed”. Issue (topic) type (TopicType), status (TopicStatus),
discipline (TopicLabel), priority (Priority) and stage (Stage) are defined, modified and used
to better deal with the problem and its solution. This definition list is given to all parties
before any BIM coordination takes place; agreed updates and modifications to these
definitions may occur in subsequent meetings. This process breakdown summarizes the
key points regarding BIM coordination as described in an interview with Sakari Tohmo
(Tohmo, 2019), BIM Project Manager and BIM Coordinator for Ramboll’s BIM and Digi
Center Finland at the Espoo office.
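The status handling described above can be pictured as a small state machine. The following is a minimal sketch only: the `Topic` class, the initial "new" status and the table of allowed moves are assumptions made for illustration, read from the process description rather than taken from the BCF standard or from Ramboll's actual definition list.

```python
from dataclasses import dataclass, field

# Allowed status moves as read from the process description.
# The "new" starting status and this table are illustrative assumptions.
TRANSITIONS = {
    "new": {"in progress", "closed"},
    "in progress": {"resolved", "closed"},
    "resolved": set(),
    "closed": set(),
}


@dataclass
class Topic:
    """One coordination issue (topic) with the agreed definition fields."""
    guid: str
    title: str
    topic_type: str
    priority: str
    stage: str
    labels: list = field(default_factory=list)       # design disciplines involved
    assigned_to: list = field(default_factory=list)  # who must help solve it
    status: str = "new"

    def move_to(self, new_status):
        """Change status, rejecting moves the agreed process does not allow."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot move from {self.status!r} to {new_status!r}")
        self.status = new_status


if __name__ == "__main__":
    clash = Topic("hypothetical-guid-0001", "Duct crosses beam", "Clash",
                  "High", "Detailed design", labels=["Structural", "HVAC"])
    clash.move_to("in progress")  # classified and assigned
    clash.move_to("resolved")     # solution approved by the BIM coordinator
    print(clash.status)
```

Keeping the allowed moves in one table mirrors the idea of a definition list agreed before coordination starts: if the parties later agree on other statuses, only the table changes.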
For any building design and construction, different design discipline specialists,
contractors, project owners and project members interact with the BIM coordinator,
pushing coordination issues (topics), problems and solutions for the
building model back and forth. The coordination process as seen today in the modern AEC industry is
therefore a complex stream of data, a collaborative effort to track, publish and solve all
issues (topics) regarding the construction of a building. This collaborative effort is broad in
scope, resulting in a massive flow of data between all parties. It is the BIM
coordinator’s duty and responsibility to encourage and facilitate information sharing and
distribution among all parties. It is a daunting task that will either result in a successful
project or end in chaos and a failed, problem-ridden building. Likewise, the BIM
coordinator must be able to convey the importance of the information being shared to
all parties, including clients and project owners, who, more
often than not, might not be technically trained to understand the nature and
importance of said information. This means that shared BIM information must adapt
to the language of whoever is accessing it, so that all project parties can effectively evaluate
and help facilitate the resolution of issues (topics). With this objective, periodic
coordination meetings are arranged to provide a general overview of project and
issue (topic) evolution, and to deal with other matters that might not have been dealt with
because of their difficulty or the schedule.
This is where the problems mentioned before come to life. There is an overabundance of
issues (topics) to deal with, their apparent complexity varies according to the
qualifications of each project member, and late communication and data loss can lead to expensive
after-the-fact solutions. These are but a few of the difficulties that the BIM coordinator, as
well as the other parties involved in the project, must deal with.
2.3 Communication in the AEC industry
According to a recent study by PlanGrid, an Autodesk company, published in 2018 in
collaboration with the FMI Corporation, costs of more than $31 billion³ in the USA can be directly
attributed to miscommunication and poor project data (Autodesk, 2018).
³ Thousand million
Poor communication is a well-documented problem in the construction industry, as shown
by multiple studies (AbdulLateef, et al., 19-22 June 2017), (Hoezen, et al., 2006), among
others, and it seems only logical that part of that problem would have its equivalent or origin
in the tools and methods common in the AEC industry. The McKinsey management
consulting group mentions digital collaboration and mobility, and advanced analytics, as
two of the five “big ideas poised to disrupt construction” (Mckinsey Global Institute, 2016),
and the World Economic Forum, in its Shaping the Future of Construction: A Breakthrough
in Mindset and Technology (World Economic Forum, The Boston Consulting, 2016),
mentions that establishing “industry standards – for communication protocols, for
instance – so that automated and interoperable equipment can be applied widely to
overcome the fragmented and multi-stake holder nature of construction processes” will
lead to increased benefits and significant savings. It also mentions that “insufficient
knowledge transfer from project to project”, “weak project monitoring” and “little cross-
functional cooperation” have led to diminished productivity and performance in the sector.
Communication in the AEC industry is a demonstrable problem.
Figure 2 Adapted from: Shaping the Future of Construction A Breakthrough in Mindset and Technology
(World Economic Forum, The Boston Consulting, 2016)
A similar trend in building sector productivity is visible in Finland, as seen in a figure
by Tilastokeskus published by (Rakennuslehti, 2017).
Figure 3 Adapted from Tilastokeskus by Rakennuslehti: Value added labor productivity by industry
Communication standards exist in the form of BCF and Industry Foundation Classes (IFC),
for example, but it is not only the existence of a communication protocol that is relevant:
how that collaboration takes place and how that information is presented is critical, as “the
impact of BIM on collaboration is understood as a reshaping of an individual’s cognitive
determinants, which influence a team member’s framing of event patterns enacted
throughout project delivery” (Poirier, et al., 2017).
2.4 The need for a BIM information manager
The volume of information produced in normal BIM projects creates “the need to
explicitly manage project information and information systems” (Froese, 2010). Moreover, “..the
information subprocess around the BIM managers reveals the importance of information
management in BIM projects and how it is necessary to clearly redefine the connections
and the interactions between the workflow and the information flow” (Boton & Forgues,
2018). To this end, in the UK, ISO 19650 (superseding PAS 1192) calls for a BIM
information manager as a minimum requirement for any BIM Level 2 project (International
Organization for Standardization, 2018), whose role, among others, would be to establish a
Common Data Environment (CDE) to collect, manage and disseminate documentation,
graphical models and non-graphical data for the project team, facilitating collaboration
between project members by enabling integration and coordination of data.
In this way, the BIM information manager, in collaboration with the BIM manager and
coordinator, needs to devise a solution that helps the BIM coordinator do their job. “The
management of construction projects is a problem of information...” (Winch, 2010), in
which not only is a lack of information dangerous, but a failure to further classify,
compile, filter and present that information in a simple, effective way that makes sense
for all parties involved will also lead to increased cost and inefficiencies.
2.5 BIM coordination data extraction
To provide better access to data, improved productivity, increased project
information accuracy and a better understanding of that data, as opposed to merely having it,
knowing of it or, worst of all, not having it at all, rethinking the process by which BIM
coordination data is presented becomes critical. An enhanced BIM
coordination data flow process must ensure easy access and digital collaboration for all
parties involved in the project. It must encourage repeatability and traceability to reduce
variability and waste of time and effort, not to mention confusion, misunderstandings and
lack of ownership. It should, where possible, enhance current BIM coordination and support,
and be supported by, company-wide digital investment efforts; otherwise it risks a lack of
traction and integration, leading to lost time, effort and competitive advantage. The
(Mckinsey Global Institute, 2016) report mentions that the tools that will move the AEC
industry into the next paradigm fall into three categories: (1) on-site execution, (2) digital
collaboration and (3) back-office integration. The tool presented in this thesis falls under
(2) and (3), the latter being perhaps less obvious but probably the one that will produce the most
insightful metadata. Data regarding finance, schedules, human resources and management, and
resource planning, to name just a few, will become apparent. Project-specific information
such as which design disciplines tend to produce larger numbers of clashes, the time required
to solve issues (topics) vs. issue (topic) severity, which types of projects require more
attention, etc. will become accessible and be liberated from its current dormant state,
perpetually siloed in the BIM coordination BCF file. If the tool is properly implemented and followed,
it is difficult to foresee what kind of insight it might produce once the data has been collated,
as it is precisely that metadata, hidden from view, that is hardest to see.
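Two of the questions mentioned above, which disciplines produce the most topics and how resolution time relates to severity, reduce to simple aggregations once topics exist as records. The sketch below is illustrative only: the record fields (`labels`, `priority`, `created`, `resolved`) are hypothetical names, and in particular a `resolved` timestamp is assumed to have been derived from successive uploads, since it is not a field stated in the text.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean


def topics_per_discipline(topics):
    """Count how many topics each design discipline (label) is involved in."""
    counts = Counter()
    for topic in topics:
        counts.update(topic["labels"])
    return counts


def mean_days_to_resolve(topics):
    """Average days from creation to resolution, grouped by priority."""
    durations = defaultdict(list)
    for topic in topics:
        if topic.get("resolved") is None:
            continue  # still open, no resolution time yet
        created = datetime.fromisoformat(topic["created"])
        resolved = datetime.fromisoformat(topic["resolved"])
        days = (resolved - created).total_seconds() / 86400
        durations[topic["priority"]].append(days)
    return {priority: mean(values) for priority, values in durations.items()}


if __name__ == "__main__":
    # Hypothetical records, not data from the thesis.
    sample = [
        {"labels": ["Structural", "HVAC"], "priority": "High",
         "created": "2019-09-02T08:00:00", "resolved": "2019-09-04T08:00:00"},
        {"labels": ["HVAC"], "priority": "High",
         "created": "2019-09-03T08:00:00", "resolved": "2019-09-07T08:00:00"},
        {"labels": ["Architectural"], "priority": "Low",
         "created": "2019-09-03T08:00:00", "resolved": None},
    ]
    print(topics_per_discipline(sample).most_common())
    print(mean_days_to_resolve(sample))
```

In the proposed system these aggregations would more likely live in the dashboard layer; the Python form is used here only to make the questions concrete.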
Enhanced BIM coordination data collaboration supports the efforts currently led by the
BIM and Digi Center unit in Ramboll Finland. Automated quality control reporting has led
to significant improvements in the quality of Building Information Models. This effort has
been developed following a LEAN mentality, where new automated workflows deliver
comprehensive quality overview of delivered Building Information Models with little
interaction and minimal extra workload compared to the benefits they give in return. The
tool proposed in this thesis follows that same mentality. The tool can enhance the current
quality control system and can act as an independent source for other valuable business
information, as once the data contained in the BCF files is parsed and aggregated,
extracting further information is a matter of asking the right questions.
2.5.1 BIM coordination data flow
The current data flow is very simple, yet ineffective. Data produced with Solibri, or any
other BCF-compliant software, resides in a hopefully unique file that is shared by the BIM
coordinator with all relevant members, who then act on the information contained within,
updating, solving or otherwise handling issues (topics) in their BIM authoring software of choice.
Unfortunately, this flow often results in:
1. Multiple BCF files with multiple versions for every design discipline involved,
complicating collaboration and issue (topic) solving and leading to delays,
overwork and cost overruns (see figure 5).
2. Easily manipulated, erased, modified or edited issue (topic) information. With no
central supervision control, traceability and accountability of the issues (topics)
reported are lost.
3. Issues (topics) are tracked but no other metadata regarding the project is
produced.
4. Once the project finalizes, the BCF file resides in a project folder “never to be seen
again”. Thus, valuable data becomes siloed and no valuable insights on the
business are gained.
Figure 4 Current BCF data flow
This means that, because of the static nature of the file, no other points of interaction
are produced without heavy time investment. Fewer points of contact with the data mean
fewer business opportunities.
Figure 5 Example of the current file exchange
The proposed automatization (see figure 6) would have the following data flow. The graph
shows how the data ingestion, storing and serving processes would occur. Further
details of this process can be found in part 4. It would represent the “equivalent” of each
BCF file transaction as seen in figure 4, but it would only need to happen once per
coordination effort, not multiple times. It would also allow for multiple points of contact
with the data. Figure 7 shows what the proposed file exchange would look like.
Figure 6 Proposed data flow
Please refer to part 4.2 for a technical data flow chart of the above-mentioned process.
In this proposed data flow, Solibri BCF files, or any BCF compliant files, are read and parsed
with FME or by a Python script. The FME flow(s) or Python script(s) (discussed later) would
parse the file, extracting all relevant information. This information would be reworked into
the JSON format. Then the customer project database would be queried for previously
existing data, storing new data and updating existing data. Once this data resides in the
cloud server in a known structure, it can be queried, combined and served in many ways.
The proposed centralization and automatization (see figure 7) would have the following
benefits regarding points of contact with the data:
1. Querying the database (dB) to export relevant data in spreadsheet format.
2. Presenting this data through a web platform.
3. Creating a website that would allow for online data management and interaction.
4. Visualizing, analysing and sharing valuable information through business intelligence
applications such as PowerBI.
5. Regenerating project data into a BCF file to export and share if needed.
6. Allowing for integration with BIM authoring software (third party or native).
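As an illustration of the first point of contact, the following minimal Python sketch flattens the topics of one project document into spreadsheet-friendly CSV text. It assumes a project document holding a "Topics" list whose entries carry keys such as "xml_topic_guid", "Title" and "TopicStatus"; the function name topics_to_csv and the sample data are hypothetical and only serve the example.

```python
import csv
import io


def topics_to_csv(project, columns=("xml_topic_guid", "Title", "TopicStatus")):
    """Return the project's topics as CSV text with one row per topic."""
    output = io.StringIO()
    # Keys that are not listed in `columns` are ignored; missing keys become "".
    writer = csv.DictWriter(output, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for topic in project.get("Topics", []):
        writer.writerow(topic)
    return output.getvalue()


# Hypothetical project document used only for demonstration.
sample_project = {
    "name": "Project Name and Number",
    "Topics": [
        {"xml_topic_guid": "a1", "Title": "Clash", "TopicStatus": "Open", "Priority": None},
    ],
}
csv_text = topics_to_csv(sample_project)
print(csv_text)
```

The resulting text can be saved with a .csv extension and opened in any spreadsheet application.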
Figure 7 Proposed file exchange
Direct benefits of the new interaction are:
As seen by the BIM coordinator:
1. Reduced workload:
a. Only one coordination effort per IFC model update is needed.
b. No need for multiple BCF file deliveries, as all parties can access the central source.
c. Fewer updates and sources result in fewer last-minute changes.
2. Stronger BIM coordinator role:
a. Less wasted effort allows the BIM coordinator to concentrate
on real model coordination.
b. Higher issue (topic) traceability empowers the coordinator role.
c. Higher accountability, when having a unique source of truth,
empowers the coordinator role.
As seen by the BIM design discipline specialists:
1. Reduced fragmentation results in smaller chance of rework due to old sources.
2. Always accessible source of truth.
a. Teams can manage their commitments inside their areas of accountability
with no need for delays, helping planners in staying on track with the work.
b. No need for Requests for Information (RFI) as source is available.
As seen (broadly) by the end user, client or third parties:
1. Better sources of information result in better models.
2. Lower costs from rework and mistakes.
3. Better workflow overall smooths all other operations by releasing human
resources.
4. More points of information provide more business intelligence.
5. Increased trust and reduced data loss provide better business intelligence.
6. Simplified data results in better understanding of the whole process.
Note that the end user might not always be the client as seen by the contractor, in this
case Ramboll. On many occasions Ramboll works for a third party (e.g. acting as a
subcontractor), which might be the principal designer or otherwise. In broad terms there is
much to be gained by all parties involved in having an issue free design.
As seen by contractors and other third parties:
Essentially, the benefits that BIM itself brings to the table: better models that result in
better added value, coupled with better insights into their business and better planning by
liberating valuable human resources.
Recapping: more points of contact with the data are created, better insight into historical
data is achieved, data extraction is simplified, and data synchronization, tracking and
ownership are ensured.
3 File Formats and Tools
3.1 The BCF File Format
The BIM Collaboration Format or BCF, is an XML-schema (Extensible Markup Language)
based, human readable file format developed by Tekla (now part of Trimble) and Solibri
and introduced in 2009, designed to enable exchange of data between Building
Information Models and software tools using an open standard that would later be
adopted by buildingSMART for the AEC industry. BCF allows for workflow communication
and can be connected to IFC files. Initially developed as a file-based implementation, a
server-based implementation is available through the BCF-API for full applications via a
RESTful web interface.
File and folder structure, as well as file content, can be found in complete form on the
BCF-XML definition page (buildingSMART, 2017). Of the content available in the
definition, Solibri, as of its current version, exports only a selection of the optional
information. The table of supported content is omitted here because of size constraints;
please refer to Table 1 in the Appendix for a detailed list of supported elements.
Being an XML-schema compliant format, BCF files have a tree structure that contains
markup and content, tags, elements and attributes that build up the keys that encapsulate
the information in a human readable text form. Tree structure depiction for BCF Markup
and Visualization Information can be found in the Appendix for clarity and size
constraints.
All this information regarding issues (topics) is then placed in a folder named after the GUID
assigned to the topic. This GUID is unique and identifies the issue (topic) throughout the
process, from the moment it is opened until it is closed. Issue (topic) folders are then
zipped according to the .BCFZIP encoding guide described in the documentation. This zipped
folder constitutes the BCF document that can be shared, read and manipulated as needed by
Solibri or any other BCF compliant software. If unzipped and opened, the files present a
structure as seen in Figure 8.
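The folder-per-topic layout can be inspected with a few lines of Python. The following is a minimal sketch, not part of the thesis tooling: it builds a tiny illustrative archive in memory and groups its entries by topic GUID folder. The helper name topics_in_bcfzip and the archive contents are assumptions made for the example.

```python
import io
import zipfile
from collections import defaultdict


def topics_in_bcfzip(archive_bytes):
    """Map each topic GUID folder to the file names stored under it."""
    topics = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zfile:
        for name in zfile.namelist():
            folder, _, filename = name.partition("/")
            # Root-level files and bare directory entries have no file part.
            if filename:
                topics[folder].append(filename)
    return dict(topics)


# Build an illustrative archive with a single topic folder.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zfile:
    zfile.writestr("bcf.version", "<Version/>")
    zfile.writestr("fea569e9-b4ec-4b87-94e0-666465f197f6/markup.bcf", "<Markup/>")
    zfile.writestr("fea569e9-b4ec-4b87-94e0-666465f197f6/snapshot.png", b"")

structure = topics_in_bcfzip(buffer.getvalue())
print(structure)
```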
An example of markup.bcf and visualizationinfo.bcfv contents can be seen in the
Appendix.
The structured text-based nature of these files allows for parsing and document
information extraction with a moderately low effort. Information contained in the
document can then be reformatted to better suit the needs of the workflow. In this case
the intent is to upload this information into a database where it would then be saved and
served. The available option for storing data is a database in Microsoft’s Azure cloud
computing service.
3.2 The JSON file format
JavaScript Object Notation (JSON) is a standardized human readable, language
independent data format derived from JavaScript and commonly used for asynchronous
browser-server communication. JSON is intended as a data serialization format and it
differs from XML in that it does not separate data from metadata and in that it uses a key-
value mapping to address information, where in XML this addressing happens on nodes.
The reasons for choosing the JSON format are:
1. It is an open standard supported by all available tools.
2. It is easy to read.
3. It is easy to validate.
4. It can represent any kind of data.
5. It is widely supported and is the native data format of the web.
6. It can contain international characters such as the Nordic å, ö and ä.
7. JSON requires no schema and couples nicely with NoSQL databases.
Figure 8 BCF folder and content
Parsing the XML to format the JSON document is fairly straightforward once all information
elements have been found in the BCF file. The proposed JSON formatting would take the
same attributes and parameters in the BCF and convert them one to one to their JSON
key-value equivalent. This greatly simplifies the procedure, as it follows the same standard
definition in the BCF guide and only modifies its structure, that is, how the information is
presented, not its definition.
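A minimal Python sketch of this one-to-one conversion is given below. It is an illustration under stated assumptions rather than the FME workflow used in this thesis: the markup snippet is invented for the example, the helper name topic_to_dict is hypothetical, and repeated child elements (such as several labels) would need list handling that is omitted here.

```python
import json
import xml.etree.ElementTree as ET

# Invented markup.bcf fragment used only for demonstration.
MARKUP = """<Markup>
  <Topic Guid="fea569e9-b4ec-4b87-94e0-666465f197f6" TopicType="Error" TopicStatus="Open">
    <Title>Topic title</Title>
    <Index>1</Index>
  </Topic>
</Markup>"""


def topic_to_dict(markup_xml):
    """Copy Topic attributes and child elements to same-named JSON keys."""
    topic = ET.fromstring(markup_xml).find("Topic")
    record = {"xml_topic_guid": topic.get("Guid")}
    for attribute in ("TopicType", "TopicStatus"):
        record[attribute] = topic.get(attribute)
    # Each child element becomes a key holding the element text.
    for child in topic:
        record[child.tag] = child.text
    return record


topic_record = topic_to_dict(MARKUP)
print(json.dumps(topic_record, indent=2))
```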
Although the proposed method would only search for and translate text data as exported by
Solibri, it would be of interest to develop a template that searches for all BCF
information, leaving empty value fields in the JSON document as needed. This would make it
possible to transfer any BCF compliant file produced by any software into a database-ready
document. This possibility has been tested successfully with the current proposal, using
[Labels], [Priority] and [ReplyToComment.Guid] as a means of enhancing BCF data.
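The template idea can be sketched in Python as follows. The field list below is a hypothetical subset chosen for the example; a complete template would enumerate every element defined in the BCF markup schema. The helper name apply_template is likewise an assumption.

```python
import copy

# Hypothetical subset of topic fields with their empty defaults.
TOPIC_TEMPLATE = {
    "Title": None,
    "Priority": None,
    "Labels": [],
    "AssignedTo": None,
}


def apply_template(record, template=TOPIC_TEMPLATE):
    """Return a record containing every template key, empty where not exported."""
    # Deep copy so that list defaults are never shared between records.
    filled = copy.deepcopy(template)
    filled.update(record)
    return filled


filled_topic = apply_template({"Title": "Clash between duct and beam"})
print(filled_topic)
```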
Two problems arise with this proposal due to how Solibri is exporting data.
1. As seen in Table 1, project.bcfp is not currently exported. This creates a vacuum of
information that would be needed when tying topic information to a project.
2. Client information has no place inside the BCF file therefore it cannot be linked to
the project.
This means that once data is extracted there is no way to identify client or project
information thus leaving data “orphaned” in the database.
The current workaround is to obtain the project name from the BCF filename. Provided that
the filename is relevant to the project and is not changed during the project lifetime,
this suffices to link BCF data to a project when serving it from the web. This is in any
case just a patch. The BCF-API does contain a definition for project identification with
web services, which the proposed workflow adopts as [project_id] and [name].
A proposal for update for this workaround is discussed under 6 Conclusion and Further
Steps.
An example of the current JSON formatted file follows.
{
  "project_id" : "6921724f-6fa6-4d0f-ae93-7bc977751521",
  "name" : "Project Name and Number",
  "Topics" : [
    {
      "Labels" : [],
      "xml_topic_guid" : "fea569e9-b4ec-4b87-94e0-666465f197f6",
      "Title" : "Topic Title as given by BIM coordinator",
      "TopicType" : "TopicType",
      "TopicStatus" : "TopicStatus",
      "Priority" : null,
      "Index" : "1",
      "CreationDate" : "2019-01-16T10:41:23+02:00",
      "CreationAuthor" : "[email protected]",
      "ModifiedDate" : "2019-04-29T14:44:53+03:00",
      "ModifiedAuthor" : "[email protected]",
      "AssignedTo" : "Assigned to text",
      "Comments" : [
        {
          "Date" : "2019-01-16T10:41:59+02:00",
          "Author" : "[email protected]",
          "Guid" : "e0dfaac1-0999-4454-adf7-ae565c11f0fb",
          "ReplyToComment.Guid" : null,
          "Comment" : "Comment text 1"
        },
        {
          "Date" : "2019-01-16T17:00:00+02:00",
          "Author" : "[email protected]",
          "Guid" : "e0dfaac1-0999-4454-adf7-ae565c11f0fb",
          "ReplyToComment.Guid" : null,
          "Comment" : "Comment text 2"
        }
      ]
    }
  ]
}
3.3 MS Azure and Cosmos DB
The available option for data storing is a database in Microsoft’s Azure cloud computing
service. Azure provides Software as a Service (SaaS), Platform as a Service (PaaS) and
Infrastructure as a Service (IaaS) supporting multiple programming languages, tools,
frameworks etc. This solution frees its clients from having to deal with the associated costs
and problems of supporting their own platform. Currently this service is integrated
companywide and thus reduces the time, cost and human resources needed to use it. The
server set-up was aided by Joonas Kiiskinen, developer at Ramboll.
The database of choice is Cosmos DB, a schema-agnostic NoSQL database. It implements
a subset of SQL SELECT on JSON documents, thus providing a simplified container-like
system to store information. BCF information would be extracted, parsed and formatted
into the JSON file format, and uploaded to Microsoft’s services. This choice allows for good
compatibility among file formats and has proper support with the other tools described.
The NoSQL nature of Cosmos DB provides more flexibility than relational databases, storing
data in a key-value structure that closely matches JSON files. This flexibility regarding
data structure allows for straightforward updates of contents if at any time Solibri
commences or ceases support for different information elements, or for mixing and matching
BCF data from different software tools. Likewise, no change to the data structure is needed
if it becomes necessary to enhance the BCF data with information contained in other sources.
A major problem of this database is that storing image data, as contained in BCF files, is
costly in terms of queries and infrastructure. Ideally this data would reside in blob storage
and would be retrieved simultaneously. This path has not been explored in this thesis, as
it would require a considerable investment of time and effort and is partly out of
scope.
3.3.1 Data separation and security
In a multi-project, multi-client enterprise environment it is imperative that a secure way to
store and serve this information to relevant project members is devised. This is not
part of the scope of this thesis. It is highly encouraged that this matter is investigated
thoroughly before deploying the proposed solution in a “live” environment.
The following measures have been taken for security. BCF information is stored in a
project “container” unique to said project. Read and write operations to the database
require a passphrase. Only one operation is permitted to each solution.
3.3.2 Creating a service
The steps required to create a virtual server are limited to the IT specialist with enough
administrative rights to create said service. They are as follows:
1. Creation of an Azure Cosmos DB service.
2. Filling Subscription information regarding:
a. Resource Group.
b. Instance Details and API.
c. Geo-redundancy and Multi-region write.
3. Filling of network information.
To create an information container, it is required to:
1. Click “+ New Container” filling information as required. (see Figure 9)
2. Once the container is created it will wait for data to be sent or read. (see Figure 10)
3. Keys (Read-Write and only-Read) can be fetched from the icon shown in Figure 11.
Figure 9 Add new Container
Figure 10 BCF data container
Figure 11 Key location. Data was edited for security reasons
3.4 FME
FME (Feature Manipulation Engine) is a data integration software platform that allows for
easy data manipulation through a visual programming workflow. It is ideal when working
with large data sets and multiple formats.
In FME, nodes are used to read, transform and write data. Data is routed through channels,
passing relevant information from one node to another. Data can be collected from several
sources and formats, collated, manipulated, enhanced or otherwise written into a third
format. It integrates with Microsoft Azure Cosmos DB and easily reads and writes both XML
and JSON, which makes it ideal for the proposed solution. The current solution runs on
standalone FME Desktop.
The proposed data flow reads a BCFZIP, unzipping it and searching for all markup.bcf files.
The information contained is then parsed and formatted as in the example shown in 3.2. This
data is then saved into a JSON file ready to be uploaded. See figure 12 for the complete
workspace required to convert BCF data to JSON. A technical description of the parsing
process required for this purpose is given in chapter 4.2.
3.4.1 Uploading to the server
Figure 12 Complete BCFZIP to JSON FME workspace
Further information regarding the inner workings of the workspace is provided under part 4
Technical solutions.
The next step once the JSON file has been generated is to upload the data to the server.
In order to upload the data to the database, a writer connection must be established.
The data needed to connect to the database is as follows:
1. Cosmos DB Account ID: URI
2. Master Key: Primary Key
3. Database: Database name
Note: The Uniform Resource Identifier (URI) can be found in the server “overview” panel or
in the “keys” panel, where the primary (master) key is also found.
These steps are detailed as screenshots of the process in FME in figure 13 with figure 14
showing the complete workspace.
Figure 13 Establishing a Database Connection in FME
In the resulting workspace, the collection parameter Feature Operation should be set to
UPSERT in order to ensure proper field updates. [Collection Name] corresponds to the
collection container name as established in the server.
Running the workspace will upload the data to the server, where it will reside
until it is deleted or manipulated.
If the workspace is supplied with new data, it will be inserted in the database (see figure
15). Data that was previously contained and that has been modified will be updated. Care
should be taken not to set Feature Operation to drop, or all contained data will be deleted
and replaced by the new data.
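The insert-or-update behaviour described above can be illustrated with a short Python sketch. This is a plain in-memory stand-in and not the FME writer or the Cosmos DB API: the collection is a dictionary, and keying documents by "xml_topic_guid" is an assumption made for the example.

```python
def upsert(collection, document, key="xml_topic_guid"):
    """Insert the document, or replace the stored one that has the same key."""
    existed = document[key] in collection
    collection[document[key]] = document
    return "updated" if existed else "inserted"


collection = {}
# New topic: inserted.
first = upsert(collection, {"xml_topic_guid": "a1", "TopicStatus": "Open"})
# Same GUID with modified content: the stored document is updated.
second = upsert(collection, {"xml_topic_guid": "a1", "TopicStatus": "Closed"})
# Different GUID: inserted alongside the existing topic.
third = upsert(collection, {"xml_topic_guid": "b2", "TopicStatus": "Open"})
print(first, second, third, len(collection))
```

In contrast, a drop operation would correspond to emptying the collection before writing, which is why it removes everything previously stored.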
Figure 15 Updated data
Figure 14 Complete Cosmos DB writer FME workspace
3.4.2 Downloading from the server
The process to download data from the server works in a similar way. In this case JSON is
downloaded and formatted (see figure 16).
Figure 16 Complete Cosmos DB to JSON FME Workspace
Data is reconverted into XML compliant data and then fanned out using [xml_topic_guid]
as separation parameter and saved into a BCFZIP file as shown in figure 17.
Figure 17 JSON to XML FME Workspace
Further information regarding the inner workings of the workspace is provided under part 4
Technical solutions.
The connection to Cosmos DB is made in the same manner as before. FME saves the
connection parameters, which may be reused in this situation. There is no need to create a
new connection to the database server.
3.4.3 Current situation with Cosmos DB and FME
During the writing of this thesis, Microsoft announced it would drop support for
non-partitioned databases in Cosmos DB (see figure 18). More information can be found in
Azure’s documentation (Microsoft, 2019). Partitioning in large clustered systems reduces
the likelihood of failure. Also, non-partitioned databases do not scale well and should be
avoided when possible.
Figure 18 Microsoft's announcement
Without going deeper into document-based partitioning and indexing, this “unfortunate”
event means that FME 2018 is no longer capable of read and write operations with Cosmos
DB. Support for partition keys has been announced for FME 2019 (SAFE software, 2019).
More information regarding this issue can be found in the FME documentation. This plainly
means that, as of today, without access to FME 2019 the workspaces described above do
not work. Please refer to part 6: Conclusion and Further Steps for further details. In any
case this only poses a minor inconvenience, as data could still be uploaded to the server
by other means.
3.5 PowerBI
Once data resides in the server, it is possible to serve it to Business Intelligence platforms
such as PowerBI, a business analytics platform from Microsoft. PowerBI can read data
from several sources and makes it easy to create visualizations of said data. More
information regarding PowerBI can be found in its documentation pages (Microsoft, 2019).
3.5.1 Reading Cosmos DB data
It is worth noting again that data access and security have not been a part of this thesis,
yet they remain integral for this to develop into a business case. PowerBI allows for user
access control through the Data Analysis Expressions (DAX) function USERPRINCIPALNAME()
(see figure 19) (Microsoft, 2019). A proper client-to-project reference should be
developed if the Cosmos DB server is to hold data for more than one client. Please refer to
part 6 Conclusion and Further Steps for more information.
Figure 19 DAX user guide
PowerBI can access data directly from any Cosmos DB by providing its URI string. Unfolding
the data in tabular form is a straightforward step (see figure 20). Just by selecting an Azure
Cosmos DB as a new data source in PowerBI, it is possible to synchronize the dashboard with
the previously extracted BCF data.
Ideally all data provided to PowerBI would be populated and would follow a standard. In
this regard there are two problems that would require further attention.
1. Solibri does not populate all data fields by default. This means that data rows will
appear as “null” objects. This is far from optimal, especially in rows where
populated data would live side by side with null values.
2. Strings populated by hand tend to show wide variation in values that
should otherwise be the same. For example, labels like “Arch”, “ARCH” and
“Architecture” are not the same even if they refer to the same
concept. This will cause a problem when analysing data. Please refer to part 6:
Conclusion and Further Steps for details on how to address this problem.
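Both problems can be mitigated with a small conditioning step before the data reaches the dashboard. The following Python sketch is one possible approach and not the method used in this thesis: the alias table, the placeholder value "Unspecified" and the function name normalize_label are all assumptions, and a real project would agree on its own vocabulary.

```python
# Hypothetical alias table mapping lower-cased variants to one agreed label.
LABEL_ALIASES = {
    "arch": "Architecture",
    "architecture": "Architecture",
}


def normalize_label(label, missing="Unspecified"):
    """Replace null or blank labels and collapse known spelling variants."""
    if label is None or not label.strip():
        return missing
    cleaned = label.strip()
    # Unknown labels are kept as written, only trimmed.
    return LABEL_ALIASES.get(cleaned.lower(), cleaned)


raw_labels = ["Arch", "ARCH", "Architecture", None, " HVAC "]
normalized = [normalize_label(label) for label in raw_labels]
print(normalized)
```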
Figure 20 PBI Data navigation
Figure 21 Null data
Figure 22 Example of Collated data
3.5.2 Data presentation and analysis
Once data has been properly conditioned and parsed, it is possible to analyse it to provide
further valuable insights, both on the project and on the AEC industry.
Note: This example does not contain real data. It is only to be used as an example of the
applications of this thesis.
Data produced in PowerBI can also be distributed through digital collaboration channels
and on mobile devices, allowing for better, more fluid communication between parties. This
is of vital importance, as other research efforts mentioned in the initial parts have shown.
In figures 24 to 27, information regarding the evolution, type and quantity of issues (topics)
is presented in a cohesive manner. Information has been structured in such a way that
it allows for much simpler interaction and overview. Compared to the normal
Excel spreadsheet that Solibri exports (see figure 23), this method is clearly much more
straightforward and intuitive. In this way both non-technical clients and managers can
obtain a better understanding of the project dynamics. Solibri’s Excel export still has a
place among the design discipline specialists, but there is much to gain in terms of
communication for the AEC industry from using this type of solution, as PowerBI allows for
information exploration in an intuitive way.
Figure 23 Solibri Excel export (blurred for privacy reasons)
Figure 24 Example of analysed data
Figure 25 Information can be analysed by exploring through the data
Figure 26 Further exploring data
Figure 27 Mobile device data sharing
Note: Figures 24 through 27 provide just an example of how much more comprehensible BCF data could be.
3.5.3 Accessing BCF data directly from PowerBI
Since BCF data is XML contained in a zip file, it was possible to extract file contents
directly in PowerBI by unzipping the file in memory. Nevertheless, DAX’s limited
capabilities might not be best suited to this problem.
Figure 28 DAX extracted BCF data
A much cleaner option is to extract that data and read the XML contents. PowerBI handles
XML data natively. An optimal solution would be to create a PowerBI connector
that would read BCF data directly from the server, but that approach has not been pursued
in this thesis. This solution could be of use in cases where proper standards for filling in
BCF data are followed. Data could then be enhanced by collating information from
other sources, assuming proper relations could be formed. An example of how to achieve
this in the Python programming language is provided in chapter 4.3.
4 Technical Solutions
4.1 The OpenBIM initiative
Much of this thesis would not be possible without previous efforts led by the OpenBIM
initiative. This open collaboration standard, led by buildingSMART, aims at better
coordination in BIM projects by promoting open collaboration workflows. This thesis does
not implement the definitions in the BCF standard or the BCF-API but uses them to achieve
the desired results.
4.2 Parsing and data extraction
The following figure shows how the data ingestion, storing and serving processes would
occur from a machine perspective, as proposed by the author.
Ideally, if the intent were to provide a native client to deal with BCF data in
the BIM authoring software, the complete BCF-API definition should be implemented,
allowing for a better, more compact data flow.
Figure 29 Data parsing
4.3 Data wrangling and code
Data contained in the BCF file, as mentioned previously, follows the XML standard and
needs to be parsed and collated into JSON before it can be uploaded to the server. This is
done with the FME workspaces shown above and attached to the thesis. What follows is a
description of how that XML data is combined to produce a unique, self-contained source of
information for the project.
The code needed to properly template the source data is as follows:
ROOT
{
  "project_id" : fme:get-attribute("fme_basename"),
  "Topics" : [
    {
      fme:process-features("MAIN","fme_feature_type",fme:get-attribute("fme_feature_type"))
    }
  ]
}

MAIN
{
  "Labels" : [fme:get-attribute("Labels{0}"),fme:get-attribute("Labels{1}"),fme:get-attribute("Labels{2}"),fme:get-attribute("Labels{3}")],
  "xml_topic_guid" : fme:get-attribute("xml_topic_guid"),
  "Title" : fme:get-attribute("Title"),
  "TopicType" : fme:get-attribute("TopicType"),
  "TopicStatus" : fme:get-attribute("TopicStatus"),
  "Priority" : fme:get-attribute("Priority"),
  "Index" : fme:get-attribute("Index"),
  "CreationDate" : fme:get-attribute("CreationDate"),
  "CreationAuthor" : fme:get-attribute("CreationAuthor"),
  "ModifiedDate" : fme:get-attribute("ModifiedDate"),
  "ModifiedAuthor" : fme:get-attribute("ModifiedAuthor"),
  "AssignedTo" : fme:get-attribute("AssignedTo"),
  "Comments" : [
    {
      fme:process-features("SUB","xml_topic_guid",fme:get-attribute("xml_topic_guid"))
    }
  ]
}

SUB
{
  "Date" : fme:get-attribute("Date"),
  "Author" : fme:get-attribute("Author"),
  "Guid" : fme:get-attribute("Guid"),
  "ReplyToComment.Guid" : fme:get-attribute("ReplyToComment.Guid"),
  "Comment" : fme:get-attribute("Comment")
}
This same work could be done in any other programming language. A similar approach
was taken using Python just to demonstrate its feasibility. The following code unpacks the
BCFZIP file in memory and searches for any markup.bcf file. Information is then parsed
and exported as JSON containing all relevant data as an output.
import fnmatch
import json
import xml.etree.ElementTree as etree
import zipfile
from pathlib import Path

filename = Path("C:\\example.bcfzip")
targetfile = Path("C:\\example.json")

# Read every markup.bcf file of the archive into memory.
issues = []
with zipfile.ZipFile(filename, 'r') as zfile:
    for name in zfile.namelist():
        if fnmatch.fnmatch(name, '*markup.bcf'):
            issues.append(zfile.read(name))

# Collect the attributes of each Topic element.
data = {'issue': []}
for issue in issues:
    root = etree.fromstring(issue)
    for parent in root:
        if parent.tag == 'Topic':
            data['issue'].append({
                "Guid": str(parent.get('Guid')),
                "TopicType": str(parent.get('TopicType')),
                "TopicStatus": str(parent.get('TopicStatus'))
            })

# Write the collected data to the target JSON file.
with open(targetfile, 'w') as outfile:
    json.dump(data, outfile)
print("done")
Cosmos DB provides bindings that would allow coding the upload-to-server functions, as
well as querying the database, if there were interest in developing a full-featured
application. Developing a full application for this purpose is out of the scope of this thesis
and has not been explored further than producing a proof of concept to parse BCF data.
More information regarding Cosmos DB API bindings for Python and other programming
languages can be found in the official documentation (Microsoft, 2019).
Likewise, reformatting JSON to XML compliant BCF data needs templating data as
follows:
<Markup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="markup.xsd">
  {fme:process-features("HEADER")}
  {fme:process-features("TOPIC")}
  {fme:process-features("COMMENT")}
</Markup>

<Topic Guid="{fme:get-attribute("xml_topic_guid")}" TopicType="{fme:get-attribute("TopicType")}" TopicStatus="{fme:get-attribute("TopicStatus")}">
  <Title>{fme:get-attribute("Title")}</Title>
  <Index>{fme:get-attribute("Index")}</Index>
  <TopicStatus>{fme:get-attribute("TopicStatus")}</TopicStatus>
  <TopicType>{fme:get-attribute("TopicType")}</TopicType>
  <CreationDate>{fme:get-attribute("CreationDate")}</CreationDate>
  <CreationAuthor>{fme:get-attribute("CreationAuthor")}</CreationAuthor>
  <ModifiedDate>{fme:get-attribute("ModifiedDate")}</ModifiedDate>
  <ModifiedAuthor>{fme:get-attribute("ModifiedAuthor")}</ModifiedAuthor>
  <AssignedTo>{fme:get-attribute("AssignedTo")}</AssignedTo>
  <Description>{fme:get-attribute("Description")}</Description>
</Topic>

<Comment Guid="">
  <Date>{fme:get-attribute("Date")}</Date>
  <Author>{fme:get-attribute("Author")}</Author>
  <Comment>{fme:get-attribute("Comment")}</Comment>
  <Viewpoint>{fme:get-attribute("Viewpoint.Guid")}</Viewpoint>
  <VerbalStatus>{fme:get-attribute("VerbalStatus")}</VerbalStatus>
  <Status>{fme:get-attribute("Status")}</Status>
</Comment>
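For comparison, the same reformatting can be sketched in Python with the standard library only. This is a hedged illustration and not the FME workspace used in this thesis: the function names topic_to_markup and write_bcfzip, the reduced field list and the sample topic are assumptions, and a complete BCFZIP would need further files (such as bcf.version) that are omitted here.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def topic_to_markup(topic):
    """Build a Markup element from one topic dictionary."""
    markup = ET.Element("Markup")
    topic_el = ET.SubElement(markup, "Topic", {
        "Guid": topic["xml_topic_guid"],
        "TopicType": topic.get("TopicType") or "",
        "TopicStatus": topic.get("TopicStatus") or "",
    })
    # Reduced field list for brevity; null values are skipped.
    for field in ("Title", "Index", "CreationDate", "CreationAuthor"):
        if topic.get(field) is not None:
            ET.SubElement(topic_el, field).text = str(topic[field])
    for comment in topic.get("Comments", []):
        comment_el = ET.SubElement(markup, "Comment", {"Guid": comment["Guid"]})
        for field in ("Date", "Author", "Comment"):
            if comment.get(field) is not None:
                ET.SubElement(comment_el, field).text = str(comment[field])
    return markup


def write_bcfzip(topics):
    """Return archive bytes with one <guid>/markup.bcf entry per topic."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zfile:
        for topic in topics:
            xml_bytes = ET.tostring(topic_to_markup(topic), encoding="utf-8")
            zfile.writestr(f"{topic['xml_topic_guid']}/markup.bcf", xml_bytes)
    return buffer.getvalue()


# Sample topic following the JSON structure used in this thesis.
example_topics = [{
    "xml_topic_guid": "fea569e9-b4ec-4b87-94e0-666465f197f6",
    "TopicType": "Error",
    "TopicStatus": "Open",
    "Title": "Topic title",
    "Comments": [{"Guid": "e0dfaac1-0999-4454-adf7-ae565c11f0fb",
                  "Comment": "Comment text 1"}],
}]
archive = write_bcfzip(example_topics)
```

Fanning out by "xml_topic_guid" here simply means that each topic is written to its own folder inside the archive.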
5 Presentation of Results
5.1 Data Mining BCF Files
As a complementary effort, data obtained from production BCF files was analysed and
is presented below. Data shown here has been anonymized for confidentiality and privacy
reasons. It was by no means an exhaustive analysis, but it reveals some interesting
patterns.
Note: Having a population size of 45 projects a sample size of 41 would be needed to
provide a confidence level of 95% with a margin of error of 5%. No better than 18%
confidence can be obtained with this sample size. Therefore, this analysis cannot provide
statistical significance for the whole BIM coordination effort. Also, the time span for the
selected projects is small, barely one year, which throws off any assumption that could be
made regarding work evolution or work patterns.
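The required sample size of 41 quoted in the note can be reproduced with Cochran's formula combined with a finite population correction. The thesis does not state which formula was used, so the sketch below is an assumption: it takes z = 1.96 for a 95% confidence level and the most conservative proportion p = 0.5, and the function name required_sample_size is hypothetical.

```python
import math


def required_sample_size(population, z=1.96, margin=0.05, proportion=0.5):
    """Cochran's sample size with finite population correction, rounded up."""
    # Sample size for an infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Correction for a small, finite population.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


needed = required_sample_size(45)
print(needed)
```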
Information has been classified in three interest groups: Topics, regarding BCF topic
information; Comments, regarding BCF comment information; and Disciplines &
References, regarding design discipline and linguistic references.
For this, twenty-five BCFZIP files spanning five BIM construction projects with a total
size of 237Mb of data were analysed. 3825722 items of information were parsed, resulting
in 835 Topics and 671 comments being explored. The project fields were Hospital,
Commercial, Industrial and two Infrastructure. They range from full collaboration from
project beginning to end to late stage limited collaboration efforts. All projects come
from the backlog of projects completed by the BIM unit’s predecessor in the last two fiscal
years before the date of this thesis, but barely span a 12-month period.
Topic and comment information
Of the 825 topics inspected, 66% had “Error” as a topic type. Only one item had
“Warning” as topic type. The rest had no information at all, or the information holder was
missing for some reason, or it was of another category. The official BCF specification
considers at least four possibilities. Regardless of what the holder could contain, this seems
a clear sign of the communication problem the AEC industry is riddled with. Using
exclusively one label probably does not clearly define the wide variety of problems
occurring during BIM coordination. On the other hand, the quick pace associated with
projects might cause a tendency to concentrate on the real problems and thus limit
communication interaction.
The vast majority of topics have the status “Open”. Considering that nearly all projects
analysed were closed, this seems to indicate that the BIM coordination workflow never really
takes the steps to conclude the information exchange. It may well be that the topics
never labelled as closed or corrected were in fact fixed and no longer present an issue
in the model, but this is difficult to say, as there is no actual confirmation of it.
The BCF definition does contain enough information to represent all the states an
issue could be in, but clearly it is not being used to its full potential. It could also
be that later-stage BCF files containing those fixes exist but never found their way to
the server. Topic descriptions have a median of 99 characters and an average
of 109 characters, and a closer look shows a reasonable use of the description field.
Topic titles, on the other hand, are longer on average, at 130 characters, and the actual
information in them shows that in some cases title fields have been filled with
information about tagged elements, using up to 290 characters.
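The length figures above are simple descriptive statistics. As a hedged illustration, the sketch below computes the median, mean and maximum length of a list of text fields with Python's statistics module. The function name and sample strings are made up for the example.

```python
from statistics import mean, median


def length_summary(texts):
    """Summarise character lengths of the non-empty strings in texts."""
    lengths = [len(text) for text in texts if text]
    if not lengths:
        return None
    return {
        "count": len(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    titles = ["MEP", "Beam clashes with duct on level 2", ""]
    print(length_summary(titles))
```

A mean clearly above the median, as reported for the descriptions (109 against 99 characters), points to a small number of unusually long entries pulling the average up.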
Comment information shows a similar trend, with a major proportion of the comments
never changing verbal status. Likewise, status for comments is either defined as “Error”
or not specified at all. The median length for comments is 72 characters, and
a closer look shows comment fields are used in a reasonable manner.
An official standard for filling in this data would need to be defined to
achieve better results; otherwise data in its current form is difficult to clean,
which complicates any mining effort.
Figure 30 Topic information
Figure 31 Comment information
A deeper look into the data showed that a large part of the information was being
conveyed using the comment fields. Unfortunately, they were not used to convey status
or type information but more practical information regarding the issue (topic).
Disciplines and references
Comment and title fields have been used to convey the bulk of the information, while
other flag fields were ignored. In a significant share of the topics, the title field was
used as a comment field, in some cases ignoring the description field completely. To obtain
better insights, title and comment fields were parsed to search for keywords that could
provide a deeper look into the information.
Approximately two-thirds of the issues are tagged as related to structural or
architectural parts of the model. The remaining third corresponds to electrical and plumbing.
55% of the comments relate to construction elements (beams, slabs, pillars, foundations
and roofs); the remaining 45% mention technological elements (cables, tubes and channels,
windows, doors and booths). A deeper look into topics with unspecified topic status
showed that they refer in higher proportion to electrical (ca. 60%) and plumbing (ca.
70%) issues. It could be that the tools used for BIM coordination in those design
disciplines do not facilitate filling in information.
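The keyword search described in this section can be illustrated with a small Python sketch. The two groups reuse the element names listed above in English; the actual keyword lists, languages and matching rules used in the thesis are not given, so everything below is an assumption for illustration.

```python
import re

# Hypothetical keyword groups based on the element names mentioned in the text.
ELEMENT_GROUPS = {
    "construction": ["beam", "slab", "pillar", "foundation", "roof"],
    "technological": ["cable", "tube", "channel", "window", "door", "booth"],
}


def classify(text, groups=ELEMENT_GROUPS):
    """Return the sorted names of the groups whose keywords occur in text."""
    found = set()
    lowered = text.lower()
    for group, keywords in groups.items():
        for keyword in keywords:
            # Match the keyword as a whole word, optionally with a plural "s".
            if re.search(r"\b" + re.escape(keyword) + r"s?\b", lowered):
                found.add(group)
                break
    return sorted(found)


if __name__ == "__main__":
    print(classify("Beam clashes with cable tray"))
```

Whole-word matching is a deliberate choice here: it avoids counting a keyword that only appears inside a longer, unrelated word, at the cost of missing inflected forms other than a plain plural.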
Figure 32 Discipline and language analysis
Figure 33 Mentions with unspecified or other Status
Work effort
Another interesting possibility opened by mining data that might otherwise get lost is
to obtain data regarding work patterns. The information analysed here was not linked to
other model information such as model size or construction type, nor is it linked to
macro-economic data or workforce information, so it is difficult to draw conclusions.
Nonetheless, a couple of patterns showed up.
1. Different coordinators work more effectively at different moments.
2. A closer look into project specific data showed a tendency to flag issues early in the
project phase or in the late stages.
No conclusions are proposed from this data, as it does not provide enough
statistical significance; the patterns are presented only as observations from the
limited data and as an example of what is possible by mining BIM coordination data.
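As a hedged example of how such patterns could be surfaced, the sketch below groups topic creation timestamps by calendar month. It assumes the ISO 8601 format with a UTC offset that appears in the CreationDate fields of the markup example in Appendix A3; the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime


def topics_per_month(creation_dates):
    """Count ISO 8601 timestamps per calendar month, keyed as 'YYYY-MM'."""
    counts = Counter()
    for stamp in creation_dates:
        # fromisoformat accepts offsets such as +02:00 (Python 3.7 or later).
        created = datetime.fromisoformat(stamp)
        counts[created.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    stamps = ["2018-12-12T16:40:02+02:00", "2019-01-16T10:06:55+02:00"]
    print(topics_per_month(stamps))
```

Grouping by author and hour of day instead of by month would be the equivalent step for the first pattern listed above.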
5.2 Summary of general results
Results are presented as a list of items in no order of importance:
1. BCFZIP files are a zipped archive of folders containing XML information.
2. .bcf files are structured following the buildingSMART BCF definition.
3. Solibri-exported BCF files contain only a portion of the data defined in the BCF
definition.
4. Solibri BCF data can be enhanced if needed from other sources but it will not be
seen in SMC.
5. FME requires a proper schema definition when dealing with varied data. This
means that if support were to be provided for BCF definitions other than
2.1, a similar effort would be needed to support them.
6. Likewise, it will probably be problematic to give support for different BCF readers.
7. There is a lack of standardization in how data is filled in, or a failure to fill in
data properly. This vastly complicates data mining.
8. Valuable information can be inferred and obtained by mining BCF files.
9. Maintaining complete BCF information including images is somewhat problematic
with FME and was not managed during the efforts for this thesis. Implementation
should come in the form of a coded app following Microsoft’s documentation
(Microsoft, 2019) or by integrating Microsoft Blob storage server into the solution.
10. Otherwise uploading document information and presenting it with the tools
provided works seamlessly.
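Item 5 implies that an extraction workflow should check which BCF version it has been given before parsing. The sketch below reads the version from a bcf.version document in Python. It assumes the file has a Version root element carrying a VersionId attribute, which should be verified against the buildingSMART schema, and the set of supported versions is a placeholder.

```python
import xml.etree.ElementTree as ET

# Placeholder: only the definition targeted in this thesis is marked as supported.
SUPPORTED_VERSIONS = {"2.1"}


def read_bcf_version(version_xml):
    """Return the VersionId attribute of a bcf.version document, or None."""
    root = ET.fromstring(version_xml)
    return root.get("VersionId")


def is_supported(version_xml):
    """Tell whether the declared BCF version is one the workflow was built for."""
    return read_bcf_version(version_xml) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    sample = '<Version VersionId="2.1"><DetailedVersion>2.1</DetailedVersion></Version>'
    print(read_bcf_version(sample), is_supported(sample))
```

Rejecting unsupported versions early keeps schema mismatches from surfacing later as silently missing fields.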
6 Conclusion and Further Steps
Parsing through the BCF information is a tedious process that can be hugely simplified by
applying automated workflows. This automation needs to look at every single element
contained, or it would not be able to work properly as a substitute for a server.
Unfortunately, maintaining complete conformance with the BCF definition is more complex than
what can easily be obtained from FME for BIM-server-like functionality. It would also mean
that third parties would need access to FME. FME Server and FME Cloud would
probably be a much better solution in this case, as they would allow workspaces to be shared
easily without the need to share FME scripts, but this cannot be asserted, as they are not
part of the currently available tools and no exploration of their capabilities has taken place.
Considering the problems with images and changes in data, and the effort required to
maintain compatibility, it would be of much greater interest to integrate available tools
like BIM-server (van Berlo & Krijnen, 2014), an open source effort by the Netherlands
Organization for Applied Scientific Research (TNO) and the Eindhoven University of
Technology, into the current workflow. This would provide software vendor independence
but would require a bigger internal effort to keep up and maintain. No exploration of
BIM-server or its capabilities has been done.
Regarding the main objective of this thesis, to automate the extraction of BCF data to
enhance QC/QA communication in BIM coordination, it has been shown that not only is it
possible, but that valuable information can be obtained. Extracting and enhancing BCF
data would increase information accessibility with minimal effort. It is worth noting that
the insight obtained from extracting BIM coordination information will only be as good
as the information put in; therefore the following steps are recommended, in order.
1. To establish the market feasibility of providing this data. To this end a proper, well
formatted, sufficiently big project should be used as an example. Automating
processes is time consuming and bigger gains would be obtained from “more data”
which means bigger BIM projects.
2. To standardize and enforce how data fields are being filled. Engaging in data
collection if no data is generated or if the data is of bad quality is futile.
3. If support for cloud-served BCF information is desired in PowerBI dashboards,
user access and data security must be explored. Access to FME 2019
is required. If no BCF-server-like compatibility is required, data
extraction and enhancing could be greatly simplified.
4. To actively gather and prepare BIM coordination data currently produced and to
collate it with other business information for further insight discovery.
5. To explore the possibility of further extracting and collating information from the
IFC relevant files for richer insights. Collating SharePoint document information
and other economic data would be beneficial.
6. To explore the feasibility of using open BIM-server in the current workflow unless
using a commercial provider of such services is found to be the best business
solution.
7. To actively seek collaboration with Solibri for better BCF support and to devise a
method to enhance BCF information.
Lastly, it is of interest to extend point number two: even if the information contained in
BCF files relates to technical problems, it is stored and presented as text. In other words,
technical information is communicated in language form, not mathematically. This means
that linguistics plays a bigger role here than mathematical and numerical expressions. This
presents a problem that should be addressed from the very beginning if we are to solve
communication problems when using BCF files. Adopting the use of the priority tag,
currently not supported by Solibri, would help to convey important information in a
numerical way. A well-defined standard could also fill in the gaps and help to
convey better information. This standardization effort should be of highest priority.
7 Bibliography
AbdulLateef, O., Seong, Y. T. & Lee, F. K., 19-22 June 2017. Roles of communication on performance of the construction sector. Primosten, Croatia, Elsevier.
Autodesk, 2018. New Research from PlanGrid and FMI Identifies Factors Costing the Construction Industry More Than $177 Billion Annually. [Online] Available at: https://www.plangrid.com/press/fmi/ [Accessed 01 08 2018].
Boton, C. & Forgues, D., 2018. Practices and Processes in BIM Projects: An Exploratory Case Study. Advances in Civil Engineering, 6 August, Volume 2018.
buildingSMART, 2017. BCF Documentation. [Online] Available at: https://github.com/buildingSMART/BCF-XML/tree/release_2_1/Documentation [Accessed 26 08 2019].
Business Dictionary, n.d. [Online] Available at: http://www.businessdictionary.com/definition/information-silo.html [Accessed 19 08 2019].
Davies, K., Wilkinson, S. & McMeel, D., 2017. A review of specialist role definition in BIM guides and standards. Electronic Journal of Information Technology in Construction, Volume 22, p. 185–203.
Froese, T. M., 2010. The impact of emerging information technology on project management for construction. Automation in Construction, August, 19(5), pp. 531-538.
Hoezen, M., Reymen, I. & Dewulf, G., 2006. The problem of communication in construction, Enschede, The Netherlands: ResearchGate.
International Organization for Standardization, 2018. Organization and digitization of information about buildings and civil engineering works, including building information modelling (BIM). [Online] Available at: https://www.iso.org/standard/68078.html [Accessed 26 08 2019].
Jones, S. & Bernstein, H., 2012. The Business Value of BIM in North America. s.l.: Smart Market.
Langar, S. & Criminale, A., 2017. Challenges with BIM Implementation: A Review of Literature. Seattle, Wa., Associated Schools of Construction.
Liu, Y., Nederveen, S. & Hertogh, M., 2017. Understanding effects of BIM on collaborative design and construction: An empirical study in China. International Journal of Project Management, 35(4), pp. 686-698.
Maankäyttö- ja rakennuslaki 5.2.1999/132, 1999. [Online] Available at: http://finlex.fi/fi/laki/ajantasa/1999/19990132 [Accessed 18 09 2019].
McKinsey Global Institute, 2016. Imagining construction’s digital future. [Online] Available at: https://www.mckinsey.com/industries/capital-projects-and-infrastructure/our-insights/imagining-constructions-digital-future [Accessed 17 08 2019].
Microsoft, 2019. Azure Documentation. [Online] Available at: https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.documents.client.documentclient.createattachmentasync?redirectedfrom=MSDN&view=azure-dotnet#overloads [Accessed 26 08 2019].
Microsoft, 2019. Cosmos DB Documentation. [Online] Available at: https://docs.microsoft.com/en-us/azure/cosmos-db/ [Accessed 26 08 2019].
Microsoft, 2019. DAX Guide. [Online] Available at: https://dax.guide/userprincipalname/ [Accessed 18 06 2019].
Microsoft, 2019. Migrate non-partitioned containers to partitioned containers. [Online] Available at: https://docs.microsoft.com/bs-latn-ba/azure/cosmos-db/migrate-containers-partitioned-to-nonpartitioned [Accessed 19 08 2019].
Microsoft, 2019. PowerBI Documentation. [Online] Available at: https://docs.microsoft.com/en-us/power-bi/ [Accessed 26 08 2019].
Pellinen, P., 2016. Developing design process management in BIM based project involving infrastructure and construction engineering, https://aaltodoc.aalto.fi/handle/123456789/19964: Aalto University.
Poirier, E., Forgues, D. & Staub-French, S., 2017. Understanding the impact of BIM on collaboration: a Canadian case study. Building Research and Information, 45(6), pp. 681-695.
Rakennuslehti, 2017. Rakennusalalla työn tuottavuus ei ole kasvanut 40 vuodessa – onko allianssista tai leanista apua?. [Online] Available at: https://www.rakennuslehti.fi/2017/09/rakennusalalla-tyon-tuottavuus-ei-ole-kasvanut-40-vuodessa-onko-allianssista-tai-leanista-apua/ [Accessed 15 09 2019].
SAFE software, 2019. FME Documentation. [Online] Available at: https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/documentdb/format_parameters_w.htm [Accessed 08 2019].
Solibri, I., 2012. COBIM: Common BIM Requirements. s.l.:COBIM Project.
Stephen A. Jones, Harvey M. Bernstein, 2012. Smart Market Report: The Business Value of BIM in North America. Design and Construction Intelligence.
Tohmo, S., 2019. BIM Coordination in Ramboll Finland [Interview] (01 06 2019).
University of Cambridge Dictionary (on-line), n.d. [Online] Available at: https://dictionary.cambridge.org/dictionary/english/information-overload [Accessed 17 08 2019].
van Berlo, L. & Krijnen, T., 2014. Using the BIM Collaboration Format in a server based workflow. s.l., Elsevier.
Winch, G. M., 2010. Managing Construction Projects: An Information Processing Approach. 2nd ed. Hoboken, NJ, USA: Blackwell Publishing.
World Economic Forum, The Boston Consulting Group, 2016. Shaping the Future of Construction: A Breakthrough in Mindset and Technology, s.l.: World Economic Forum.
Appendices
A1 Definitions
Information overload: A situation in which you receive too much information at one
time and cannot think about it in a clear way (University of Cambridge Dictionary
(on-line), n.d.)
Information silo: Any information management system that is unable to communicate
with other information management systems, even if otherwise related or within the
same organization. This can be by design or by choice for a variety of reasons, though
nowadays generally frowned upon because of the lack of accessibility and implied
limitations to productivity. (Business Dictionary, n.d.)
GUID (UUID): 128-bit number used to identify information in computer systems. When
generated following the standards, GUIDs are for all practical purposes unique
identification tags with a negligible probability of duplication.
UPSERT: Database operation that UPdates information previously contained
and inSERTs any data not yet contained.
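The two definitions above work together when BCF data is loaded repeatedly: the GUID identifies a record and the upsert decides between updating and inserting. The sketch below illustrates this in Python with a plain dictionary standing in for the database. The function names and the record layout are assumptions and do not describe the storage used in the thesis.

```python
import uuid


def new_guid():
    """Generate a random GUID (UUID version 4) as a string."""
    return str(uuid.uuid4())


def upsert(store, records, key="Guid"):
    """Update records whose key already exists in store and insert the rest.

    Returns a tuple (inserted, updated).
    """
    inserted = 0
    updated = 0
    for record in records:
        guid = record[key]
        if guid in store:
            # Existing entry: overwrite only the fields present in the record.
            store[guid].update(record)
            updated += 1
        else:
            store[guid] = dict(record)
            inserted += 1
    return inserted, updated


if __name__ == "__main__":
    store = {}
    guid = new_guid()
    print(upsert(store, [{"Guid": guid, "TopicStatus": "Open"}]))
    print(upsert(store, [{"Guid": guid, "TopicStatus": "Closed"}]))
```

Because the GUID is stable across exports, re-importing a later BCF file for the same topic updates the stored record instead of duplicating it.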
A2 Tables
Table 1 BCF information as exported by Solibri
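| Content | Status | Information | Support |
|---|---|---|---|
| File: bcf.version | Exported | | Yes |
| File: project.bcfp | Not exported | ProjectID | No |
| | | Name | No |
| | | ExtensionSchema | No |
| File: markup.bcf | Exported | | Yes |
| Header | | IfcProject | Yes |
| | | IfcSpatialStructureElement | No |
| | | isExternal | Yes |
| | | Filename | Yes |
| | | Date | Yes |
| | | Reference | No |
| Topic | | Guid | Yes |
| | | TopicType | Yes |
| | | TopicStatus | Yes |
| | | ReferenceLink | No |
| | | Title | Yes |
| | | Priority | No |
| | | Index | Yes |
| | | Labels | No |
| | | CreationDate | Yes |
| | | CreationAuthor | Yes |
| | | ModifiedDate | Yes |
| | | ModifiedAuthor | Yes |
| | | DueDate | No |
| | | AssignedTo | Yes |
| | | Description | Yes |
| | | Stage | No |
| BIMsnippet (optional) | | SnippetType | No |
| | | IsExternal | No |
| | | Reference | No |
| | | ReferenceSchema | No |
| DocumentReference (optional) | | Guid | No |
| | | IsExternal | No |
| | | ReferencedDocument | No |
| | | Description | No |
| RelatedTopic (optional) | | RelatedTopic/GUID | No |
| Comment | | Date | Yes |
| | | Author | Yes |
| | | Comment | Yes |
| | | Viewpoint | Yes |
| | | ModifiedDate | No |
| | | ModifiedAuthor | No |
| Viewpoints | | Viewpoint | Yes |
| | | Snapshot | Yes |
| | | Index | Yes |
| File: Visualization information (.bcfv) | Exported | | Yes |
| Components | | IfcGuid | Yes |
| | | OriginatingSystem | Yes |
| | | AuthoringToolId | Yes |
| OrthogonalCamera | | | No |
| PerspectiveCamera | | CameraViewPoint | Yes |
| | | CameraDirection | Yes |
| | | CameraUpVector | Yes |
| | | ViewToWorldScale | Yes |
| Lines | | | Yes |
| ClippingPlanes | | | Yes |
| Bitmap | | | Yes |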
A3 BCF file structure
Markup file formatting and structure example
Edited for brevity. To be used only as example
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Markup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="markup.xsd">
  <Header>
    <File IfcProject="1VZ3lAMxD6TwQuFEjDd_1J" isExternal="false">
      <Filename>C:\ProjectFolder\IFC\MEP\MEP.ifc</Filename>
      <Date>2019-04-24T08:47:39+03:00</Date>
    </File>
    <File IfcProject="1m4oIhZYb9BgneVY_Q6PM6" isExternal="false">
      <Filename>C:\ProjectFolder\IFC\HVAC\HVAC.ifc</Filename>
      <Date>2019-04-26T16:10:25+03:00</Date>
    </File>
    <File IfcProject="3hUqdomErBxfbxDHBcLTAm" isExternal="false">
      <Filename>C:\ProjectFolder\IFC\ARCH\ARCH.ifc</Filename>
      <Date>2019-04-26T10:03:25+03:00</Date>
    </File>
    <File IfcProject="2Q2bxfcjb2duwWv8OtLn_j" isExternal="false">
      <Filename>C:\ProjectFolder\IFC\STR\STR.ifc</Filename>
      <Date>2019-03-26T12:27:10+02:00</Date>
    </File>
    <File IfcProject="17lEiunsLBJwDfsGoEF0hw" isExternal="false">
      <Filename>C:\ProjectFolder\IFC\ELEC\ELEC.ifc</Filename>
      <Date>2019-04-08T16:06:15+03:00</Date>
    </File>
  </Header>
  <Topic Guid="5c4d9e12-7623-4384-a8f0-f080ec12d617" TopicType="Error" TopicStatus="Open">
    <Title>MEP</Title>
    <Index>28</Index>
    <CreationDate>2018-12-12T16:40:02+02:00</CreationDate>
    <CreationAuthor>[email protected]</CreationAuthor>
    <ModifiedDate>2019-04-29T14:28:57+03:00</ModifiedDate>
    <ModifiedAuthor>[email protected]</ModifiedAuthor>
    <AssignedTo>MEP</AssignedTo>
    <Description>Description text regarding the issue in the model.
15.3.2019</Description>
  </Topic>
  <Comment Guid="62a1ddd3-5750-473f-acf0-c8a8ab461ee1">
    <Date>2019-04-29T14:28:57+03:00</Date>
    <Author>[email protected]</Author>
    <Comment>Comment regarding closing the issue by the BIM coordinator</Comment>
    <Viewpoint Guid="b30121f1-6c34-4410-b291-f148382b6f25"/>
  </Comment>
  <Comment Guid="3223dd2e-bddc-4649-9ad4-1f11bdad1d61">
    <Date>2019-04-26T16:48:42+03:00</Date>
    <Author>[email protected]</Author>
    <Comment>Specialist comment regarding the issue and solution</Comment>
    <Viewpoint Guid="b30121f1-6c34-4410-b291-f148382b6f25"/>
  </Comment>
  <Comment Guid="b4f7f4da-f169-430c-9c61-d62d722600db">
    <Date>2019-01-16T10:06:55+02:00</Date>
    <Author>[email protected]</Author>
    <Comment>Initial comment regarding closing the issue by the BIM coordinator</Comment>
    <Viewpoint Guid="b30121f1-6c34-4410-b291-f148382b6f25"/>
  </Comment>
  <Viewpoints Guid="b30121f1-6c34-4410-b291-f148382b6f25">
    <Viewpoint>viewpoint.bcfv</Viewpoint>
    <Snapshot>snapshot.png</Snapshot>
    <Index>0</Index>
  </Viewpoints>
</Markup>
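To show how a markup file of this shape can be turned into a flat record for storage or reporting, the following Python sketch reads the topic attributes and child elements and lists the comments in chronological order. It is an illustration only, not the FME workflow of the thesis; the sample document is shortened, and its GUIDs and e-mail address are placeholders.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Shortened sample with placeholder identifiers, shaped like the example above.
SAMPLE_MARKUP = """
<Markup>
  <Topic Guid="t-1" TopicType="Error" TopicStatus="Open">
    <Title>MEP</Title>
    <Index>28</Index>
    <CreationDate>2018-12-12T16:40:02+02:00</CreationDate>
  </Topic>
  <Comment Guid="c-2">
    <Date>2019-04-29T14:28:57+03:00</Date>
    <Author>coordinator@example.com</Author>
    <Comment>Closing comment</Comment>
  </Comment>
  <Comment Guid="c-1">
    <Date>2019-01-16T10:06:55+02:00</Date>
    <Author>coordinator@example.com</Author>
    <Comment>Initial comment</Comment>
  </Comment>
</Markup>
"""


def flatten_markup(markup_xml):
    """Return one dict holding the topic fields and its comments, oldest first."""
    root = ET.fromstring(markup_xml)
    topic = root.find("Topic")
    # Attributes (Guid, TopicType, TopicStatus) and child elements become keys.
    record = dict(topic.attrib)
    for child in topic:
        record[child.tag] = (child.text or "").strip()
    comments = []
    for node in root.findall("Comment"):
        comments.append({
            "Guid": node.get("Guid"),
            "Date": node.findtext("Date"),
            "Author": node.findtext("Author"),
            "Comment": (node.findtext("Comment") or "").strip(),
        })
    # Assumes every comment carries a Date element in ISO 8601 format.
    comments.sort(key=lambda item: datetime.fromisoformat(item["Date"]))
    record["Comments"] = comments
    return record


if __name__ == "__main__":
    print(flatten_markup(SAMPLE_MARKUP))
```

Sorting by the parsed timestamp instead of the raw string keeps the order correct when comments carry different UTC offsets, as they do in the example above (+02:00 and +03:00).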
Figure 34 Markup.bcf tree structure
Visualization information file formatting and structure example
Edited for brevity. To be used only as example
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<VisualizationInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Guid="a0a9e2e2-c733-4124-b051-33f003840d9f" xsi:noNamespaceSchemaLocation="visinfo.xsd">
  <Components>
    <ViewSetupHints SpacesVisible="false" SpaceBoundariesVisible="false" OpeningsVisible="false"/>
    <Selection/>
    <Visibility DefaultVisibility="true">
      <Exceptions/>
    </Visibility>
    <Coloring>
      <Color Color="26408080">
        <Component IfcGuid="0Td2Y6Sv95$vNiJafd_ZS0">
          <OriginatingSystem>Autodesk Revit 2018 (ENU)</OriginatingSystem>
          <AuthoringToolId>6530118</AuthoringToolId>
        </Component>
      </Color>
    </Coloring>
  </Components>
  <PerspectiveCamera>
    <CameraViewPoint>
      <X>43.74260720185285</X> <Y>49.24594138728482</Y> <Z>26.980588620097862</Z>
    </CameraViewPoint>
    <CameraDirection>
      <X>0.9776150899404781</X> <Y>-0.20374505078485286</Y> <Z>0.0525041922264222</Z>
    </CameraDirection>
    <CameraUpVector>
      <X>-0.051399786142067076</X> <Y>0.010712244671349801</Y> <Z>0.9986207036701428</Z>
    </CameraUpVector>
    <FieldOfView>60.0</FieldOfView>
  </PerspectiveCamera>
  <Lines>
    <Line>
      <StartPoint>
        <X>36.779159366198634</X> <Y>21.659073784070543</Y> <Z>31.469038641480623</Z>
      </StartPoint>
      <EndPoint>
        <X>36.78830514445061</X> <Y>21.185525743716546</Y> <Z>31.135907529698237</Z>
      </EndPoint>
    </Line>
    <Line>
      <StartPoint>
        <X>42.102494359999994</X> <Y>33.588050611563396</Y> <Z>29.92</Z>
      </StartPoint>
      <EndPoint>
        <X>42.095766615800706</X> <Y>33.742504156223234</Y> <Z>30.0746</Z>
      </EndPoint>
    </Line>
  </Lines>
  <ClippingPlanes>
    <ClippingPlane>
      <Location>
        <X>45.462054261051016</X> <Y>0.8780225697631043</Y> <Z>0.0</Z>
      </Location>
      <Direction>
        <X>-0.9998135502620163</X> <Y>-0.019309705136600516</Y> <Z>-0.0</Z>
      </Direction>
    </ClippingPlane>
  </ClippingPlanes>
  <Bitmap>
    <Bitmap>PNG</Bitmap>
    <Reference>171e63aa-d5af-46f4-b398-b7a8fac5636a/bitmaps-a0a9e2e2-c733-4124-b051-33f003840d9f-0.png</Reference>
    <Location>
      <X>28.209298923616686</X> <Y>31.96636417496406</Y> <Z>27.36675</Z>
    </Location>
    <Normal>
      <X>-0.9364350059127844</X> <Y>0.350841103209307</Y> <Z>0.0</Z>
    </Normal>
    <Up>
      <X>-0.350841103209307</X> <Y>-0.9364350059127844</Y> <Z>-0.0</Z>
    </Up>
    <Height>154.6800000000019</Height>
  </Bitmap>
</VisualizationInfo>
Figure 35 Visualizationinfo.bcfv tree structure