11/9/17
WORKFLOWS AND DATA INTEGRATION: VISION AND SUSTAINABILITY
NICOLA MARZARI, EPFL
MARVEL
OUTLINE
1. Provenance and reproducibility of data
2. Materials’ properties from workflows, turnkey solutions
3. Interoperability of codes
4. Curation of data
5. Services to the community and scalability of efforts
COMPUTATIONAL SCIENCE AS A MIDDLE-AGES WORKSHOP
COMPUTATIONAL SCIENCE SHOULD RATHER BE…
• reproducible: often not possible from the data reported in papers
• searchable: find existing calculations, reuse them and data-mine
• reliable: results persisted in repositories, automated procedures to reduce errors and verify results
• shareable: community to share results, cross-validate them, and boost scientific discovery
ADES MODEL FOR COMPUTATIONAL SCIENCE
G. Pizzi et al., Comp. Mat. Sci. 111, 218-230 (2016)
Low-level pillars: Automation, Data
User-level pillars: Environment, Sharing
Automation: automation, remote management, high-throughput
Data: database, provenance, storage
Environment: research environment, scientific workflows, data analytics
Sharing: social, sharing, standards
AN OPERATING SYSTEM FOR SIMULATIONS
http://www.aiida.net (BST-MIT license)
G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016)
AiiDA INFRASTRUCTURE
What is AiiDA?
PROVENANCE AND REPRODUCIBILITY
DIRECTED ACYCLIC GRAPHS
Nodes: Calculations, Codes, Data
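A minimal sketch of how a directed acyclic graph of typed nodes can answer provenance questions. This is not the AiiDA API: the class `ProvenanceGraph`, its method names and the node identifiers are hypothetical and chosen only for illustration. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class ProvenanceGraph:
    """Toy directed acyclic graph of typed nodes (hypothetical, not the AiiDA API)."""

    def __init__(self):
        self.kinds = {}                   # node id -> "data", "code" or "calculation"
        self.parents = defaultdict(set)   # node id -> ids of its direct inputs

    def add_node(self, node_id, kind):
        self.kinds[node_id] = kind

    def add_link(self, source, target):
        """Record that `source` was used to produce `target`."""
        self.parents[target].add(source)

    def ancestors(self, node_id):
        """Return every node that contributed to `node_id`, i.e. its provenance."""
        seen, stack = set(), [node_id]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def execution_order(self):
        """Topological order; graphlib raises CycleError if the graph is not acyclic."""
        graph = {node: self.parents.get(node, set()) for node in self.kinds}
        return list(TopologicalSorter(graph).static_order())


graph = ProvenanceGraph()
for node_id, kind in [("structure", "data"), ("pw.x", "code"),
                      ("relax", "calculation"), ("relaxed_structure", "data")]:
    graph.add_node(node_id, kind)
graph.add_link("structure", "relax")
graph.add_link("pw.x", "relax")
graph.add_link("relax", "relaxed_structure")

print(sorted(graph.ancestors("relaxed_structure")))
```

Because links only ever point from inputs to outputs, the full history of any result can be recovered by walking the graph backwards.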
A REAL, SIMPLE DAG: ELASTIC TENSORS
[Figure: provenance graph. Node labels: CIF; Relaxation of input structure; 20 DFT calculations of deformed structures; Fits; Elastic tensors]
DATABASE STRUCTURE
NoSQL flexibility within an efficient SQL schema
Repository Folder
DbNode: entry for each node. DbLink: all links. Everything else in DbAttribute (+DbExtra for later).
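A minimal sketch of the idea of keeping nodes and links in fixed SQL tables while storing arbitrary attributes as key/value rows. It uses `sqlite3` from the Python standard library; the table names, column names and sample values are assumptions made for illustration and are not AiiDA's actual schema.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db_node (id INTEGER PRIMARY KEY, node_type TEXT NOT NULL);
CREATE TABLE db_link (
    input_id  INTEGER NOT NULL REFERENCES db_node(id),
    output_id INTEGER NOT NULL REFERENCES db_node(id),
    label     TEXT
);
CREATE TABLE db_attribute (
    node_id INTEGER NOT NULL REFERENCES db_node(id),
    key     TEXT NOT NULL,
    value   TEXT,
    PRIMARY KEY (node_id, key)
);
""")

# One data node feeding one calculation node.
conn.execute("INSERT INTO db_node VALUES (1, 'data.structure')")
conn.execute("INSERT INTO db_node VALUES (2, 'calculation')")
conn.execute("INSERT INTO db_link VALUES (1, 2, 'structure')")
# Schema-free part: any key can be attached to any node.
conn.executemany(
    "INSERT INTO db_attribute VALUES (?, ?, ?)",
    [(1, "formula", "MoS2"), (2, "ecutwfc", "40")],
)

# Query nodes by an attribute that the fixed schema knows nothing about.
rows = conn.execute("""
    SELECT n.id, n.node_type
    FROM db_node AS n
    JOIN db_attribute AS a ON a.node_id = n.id
    WHERE a.key = 'formula' AND a.value = 'MoS2'
""").fetchall()
print(rows)
```

The node and link tables stay small and indexable, while new attribute keys need no schema migration.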
MULTIPLE STORAGE-BACKEND SUPPORT
• AiiDA API decoupled from the object-relational mapper (ORM)
• Two ORMs implemented (Django and SQLAlchemy)
• Flexible backend choice based on needs
• Easy incorporation of graph databases like Neo4j and Titan
WORKFLOWS AND TURNKEY SOLUTIONS
WORKFLOWS, WORKFUNCTIONS, WORKCHAINS
• Automatic provenance tracking in the DB: inputs, outputs and function calls are stored by adding a simple decorator to existing functions
• Serial and parallel execution support: launch long-running tasks in the background
• Control provenance granularity: store the level of detail relevant to the workflows
• Progress checkpointing: restart from an arbitrary step, retry on failure
• Easy debugging, self-documenting
• Seamless mixing of local and remote jobs, background execution: daemon execution allows the machine to be shut down and to continue from the last point, essential for running long remote jobs
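A minimal sketch of the decorator idea from the first bullet: wrapping an existing function so that each call, with its inputs and output, is recorded. The decorator name `tracked` and the in-memory `PROVENANCE_LOG` are hypothetical stand-ins; in a real system the records would go to the database.

```python
import functools

PROVENANCE_LOG = []  # stand-in for the database (assumption of this sketch)


def tracked(func):
    """Record the name, inputs and output of every call to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "function": func.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
        })
        return result
    return wrapper


@tracked
def rescale(volume, factor):
    return volume * factor


@tracked
def total(volumes):
    return sum(volumes)


scaled = [rescale(10.0, factor) for factor in (0.5, 1.0, 2.0)]
answer = total(scaled)
print(answer, len(PROVENANCE_LOG))
```

The functions themselves are unchanged; only the decorator line is added, which is what makes this style cheap to adopt for existing code.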
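# Import paths follow current aiida-core releases (assumption: the 2017 release shown on the slide used different paths)
from aiida.engine import WorkChain
from aiida.orm import Str, StructureData


class PwBandsWorkChain(WorkChain):
    """Band-structure workflow; the step methods named in the outline are not shown on the slide."""

    @classmethod
    def define(cls, spec):
        super().define(spec)
        # Inputs
        spec.input('codename', valid_type=Str)
        spec.input('structure', valid_type=StructureData)
        # Current releases expect node defaults as a callable; the slide passed Str('standard') directly
        spec.input('protocol', valid_type=Str, default=lambda: Str('standard'))
        # Steps, executed in this order
        spec.outline(
            cls.setup_protocol,
            cls.setup_structure,
            cls.setup_kpoints,
            cls.setup_pseudo_potentials,
            cls.setup_parameters,
            cls.run_relax,
            cls.run_seekpath,
            cls.run_scf,
            cls.run_bands,
            cls.run_results,
        )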
DAG: MULTIPLE CODES LINKED
[Figure: provenance graph. Labels: CIF; Relaxation of input structure, electronic bands, supercell, fitting of model parameters; Result analysis and restart of another MD branching loop; Multiple independent branched MD NVE runs from an NVT run; Final result]
[Flowchart: screening workflow for Li-containing structures. Process steps and node groups with their counts are listed in extraction order; Y/N mark the accepted/rejected outputs of each filter.]
COD ( 7228 Li-containing )
CodImporter ( no partial occ., no attached H ? )
N: cifs_from_COD_imported_rejected ( 3451 )
Y: cifs_from_COD_imported_accepted ( 3777 )
CifCleaner ( Cleans and standardizes cif files )
cifs_cleaned_with_codtools_clean_cif ( 7731 )
Cif2Structure ( Parses cif files for structure )
inline_standardize_from_cleaned_cif ( 7472 )
structures_standardized_from_cleaned_cif ( 7472 )
No duplicate?
Y: structures_standardized_from_cleaned_cif_dup_filtered ( 4963 )
N: structures_standardized_from_cleaned_cif_dup_rejected ( 2509 )
NiggliReduce ( get reduced (Niggli) structure )
inline_niggle_reduce
structures_standardized_from_cleaned_cif_dup_filtered_niggli_reduced ( 4963 )
CompositionFilter ( suitable composition ? )
N: structures_composition_filter_rejected ( 3534 )
Y: structures_composition_filter_accepted ( 1429 )
AtomicDistanceFilter ( meaningful bond distances ? )
N: structures_atomic_distance_filter_rejected ( 62 )
Y: structures_atomic_distance_filter_accepted ( 1367 )
IonicityFilter ( Enough anions for Li ? )
Y: structures_ionicity_filter_accepted ( 1362 )
N: structures_ionicity_filter_rejected ( 5 )
CalculateBands ( Relax the structure and calculate bandgap )
structures___calc_vc-relax_deg_0p02_kpts_dist_0p2_psfam_ssspv1p0eff_smear_cold_volthr_0p01 ( 348 )
FirstSCF ( One SCF-cycle to estimate occupations )
bands__calc_vc-relax_deg_0p02_kpts_dist_0p2_psfam_ssspv1p0eff_smear_cold_volthr_0p01 ( 350 )
BandGapFilter ( Does the relaxed structure have a bandgap ? )
Y: structures_relaxed_bandgap_filter_accepted ( 284 )
N: structures_relaxed_bandgap_filter_rejected ( 66 )
PrepareSupercell ( Prepare supercell of suitable dimensions )
inline_make_supercell_minimal_dimension_8 ( 284 )
supercell_minimal_dimension_8_rattle_sigma_0 ( 973 )
PrepareFlipperStructure ( Prepare the structures for the pinball )
delithiate_structure_inline_pinball_kind_symbol_Li ( 975 )
structure_flipper-compatible_pinball_kind_symbol_Li ( 973 )
structure_flipper-compatible_pinball_kind_symbol_Li-diffusion-failed-on-bellatrix ( 11 )
ChargeCalc ( Calculate the charge density w. impl. lithium )
chillstep_calculations_singlescf-on-delthiated ( 257 )
charge-densities-bellatrix ( 182 )
Fitting ( Find the coefficients for the flipper )
chillstep_fitting_random_displacements-divide_r2-False_is_local-True_nr_of_force_components-5000_stdev-0p1 ( 185 )
coefficients ( 182 )
Pinball dynamics
Diffusion ( 180 )
Icsd ( 8627 Li-containing )
IcsdImporter ( no partial occ., no attached H ? )
cifs_from_ICSD_imported_rejected ( 4670 )
cifs_from_ICSD_imported_accepted ( 3956 )
structures-calc_scf-deg_0p02-kpts_dist_0p2-psfam_ssspv1p0eff-smear_cold ( 1009 )
OccupationFilter ( Are the occupations compatible with bandgap ? )
Y: structures_non-relaxed_occupation_filter_accepted ( 734 )
N: structures_non-relaxed_occupation_filter_rejected ( 235 )
Rattle ( Shake atoms )
inline_displace_atoms_sigma-0p1 ( 734 )
structures_non-relaxed_rattled_sigma_0p1 ( 734 )
VC-Relax-WF ( relax atoms and cell )
structures_calc_vc-relax_electrons_c_2e-11_energy_c_0p0001_force_c_5e-05_kpts_dist_0p2_pressure_c_0p5_psfam_ssspv1p0eff_volthr_0p01 ( 718 )
ChargeCalc ( Calculate the charge density w. impl. lithium )
charge-densities-deneb ( 11 )
Finished relaxations
pw_calc_vc-relax_electrons_c_2e-11_energy_c_0p0001_force_c_5e-05_kpts_dist_0p2_pressure_c_0p5_psfam_ssspv1p0eff_volthr_0p01 ( 689 )
SupercellPrepare ( Prepare supercell of suitable dimensions )
inline_make_supercell_minimal_dimension_8_rattle_sigma_0 ( 689 )
WORKFLOW ABSTRACTION
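A minimal sketch of a filter chain with accepted/rejected bookkeeping, in the spirit of the screening flowchart. The counts in `STAGES` are those printed in the flowchart; pairing each filter with the preceding group as its input is my reading of the chart, not something the slide states. The helper names and the toy records (with their field names and the 0.7 cutoff) are hypothetical.

```python
# (stage, input count, accepted, rejected); counts as printed in the flowchart,
# with the input group of each stage assumed from the order of the chart.
STAGES = [
    ("CodImporter", 7228, 3777, 3451),
    ("Duplicate filter", 7472, 4963, 2509),
    ("CompositionFilter", 4963, 1429, 3534),
    ("AtomicDistanceFilter", 1429, 1367, 62),
    ("IonicityFilter", 1367, 1362, 5),
]


def check_funnel(stages):
    """Return the stages whose accepted + rejected counts do not add up to the input."""
    return [name for name, total, accepted, rejected in stages
            if accepted + rejected != total]


def apply_filters(items, filters):
    """Pass items through named predicates, recording per-stage counts."""
    report = []
    for name, predicate in filters:
        accepted = [item for item in items if predicate(item)]
        rejected = [item for item in items if not predicate(item)]
        report.append((name, len(items), len(accepted), len(rejected)))
        items = accepted  # only accepted items reach the next filter
    return items, report


# Toy records with hypothetical fields, for illustration only.
toy = [
    {"id": "a", "partial_occupancy": False, "min_distance": 1.9},
    {"id": "b", "partial_occupancy": True, "min_distance": 2.1},
    {"id": "c", "partial_occupancy": False, "min_distance": 0.4},
]
survivors, report = apply_filters(toy, [
    ("no partial occupancy", lambda s: not s["partial_occupancy"]),
    ("meaningful distances", lambda s: s["min_distance"] > 0.7),
])

print(check_funnel(STAGES), [s["id"] for s in survivors])
```

Keeping the rejected group at every stage, as the flowchart does, means nothing is silently dropped and each decision can be audited later.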
INTEROPERABILITY OF CODES
PLUGIN INFRASTRUCTURE
FLEUR Interoperability through AiiDA
Fleur simulations can be chained with other codes through AiiDA StructureData nodes.
[Figure: workflow graph. Labels: StructureData; ParameterData; Fleur SCF wf; Fleur EOS wf; Fleur Band wf; Fleur DOS wf. Legend: Workflow (wf)]
SIESTA Interoperability through AiiDA
• Is published on PyPI and can be integrated with AiiDA via `pip install aiida-siesta`
• Has features implemented such as:
  - Band structure calculation
  - PDOS calculation
  - STM imaging
• Contains workflows for band structure and STM imaging, both built on top of the standard WorkChain class for the Siesta plugin.
• Is completely interoperable with other AiiDA modules, since no plugin-specific node types are introduced during workflow execution.
Example: band structure calculation for graphene, carried out completely in an interactive Python environment with AiiDA + Siesta plugin.
QE-Yambo Interoperability through AiiDA
Code-agnostic AiiDA datatypes (StructureData, KpointsData, BandsData and more) allow seamless interoperability between flagship codes in the AiiDA plugins and workflows ecosystem.
After structural optimisation of monolayer MoS2 performed with FLEUR, SIESTA or QE, the G0W0 band gap is calculated using QE+Yambo.
[Figure: YamboWorkflow. Labels: StructureData; PwCalculation (SCF + NSCF); YamboCalculation (p2y + GW); RemoteData; KpointsData; BandsData; ParametersData. Icons: Sharing, Computing, Analysing, Storing]
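A minimal sketch of why a code-agnostic data type helps interoperability: one structure object is consumed by two independent functions, neither of which knows which code produced it. The `Structure` class and both helpers are hypothetical and are not the AiiDA StructureData API; the MoS2 coordinates are illustrative, not relaxed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """Hypothetical code-agnostic structure: lattice vectors and atomic sites in Å."""
    cell: tuple    # three lattice vectors
    sites: tuple   # (symbol, (x, y, z)) in Cartesian coordinates


def to_xyz(structure):
    """Generic XYZ text: atom count, comment line, then one line per atom."""
    lines = [str(len(structure.sites)), "generated from Structure"]
    lines += [f"{symbol} {x:.6f} {y:.6f} {z:.6f}"
              for symbol, (x, y, z) in structure.sites]
    return "\n".join(lines)


def species_counts(structure):
    """Number of atoms per chemical species."""
    counts = {}
    for symbol, _ in structure.sites:
        counts[symbol] = counts.get(symbol, 0) + 1
    return counts


# Illustrative MoS2 monolayer; numbers are placeholders, not computed results.
mos2 = Structure(
    cell=((3.16, 0.0, 0.0), (-1.58, 2.737, 0.0), (0.0, 0.0, 20.0)),
    sites=(("Mo", (0.0, 0.0, 0.0)),
           ("S", (1.58, 0.912, 1.56)),
           ("S", (1.58, 0.912, -1.56))),
)

print(species_counts(mos2))
print(to_xyz(mos2))
```

Any code-specific adapter only needs to read this one type, so the code that relaxes the structure can be swapped without touching what runs downstream.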
CURATION OF DATA
SOME THOUGHTS ON DATA
• In computational science, data are naturally generated, so the workflows that create properties and data from a structure are key
• Curated data are needed (e.g. for verification or for machine learning)
• A model of data-on-demand can be implemented (high-throughput pushes the development of robust workflows that calculate properties automatically)
• Full provenance allows a-posteriori decoration of metadata
A. Merkys et al., A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD, Journal of Cheminformatics, in press (2017).
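A minimal sketch of the data-on-demand idea: a result is looked up by a hash of its inputs and the workflow runs only when no stored result exists. The class `OnDemandStore`, the hashing scheme and the toy workflow are assumptions for illustration; inputs must be JSON-serializable.

```python
import hashlib
import json


class OnDemandStore:
    """Return stored results when available, otherwise run the workflow (hypothetical)."""

    def __init__(self, workflow):
        self._workflow = workflow
        self._results = {}
        self.computed = 0  # how many times the workflow actually ran

    @staticmethod
    def _key(inputs):
        # Canonical JSON so that key order in the inputs does not matter.
        canonical = json.dumps(inputs, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, inputs):
        key = self._key(inputs)
        if key not in self._results:
            # Keep the inputs next to the output so the result stays traceable.
            self._results[key] = {"inputs": inputs, "output": self._workflow(inputs)}
            self.computed += 1
        return self._results[key]["output"]


def count_atoms(inputs):
    """Toy stand-in for an expensive property workflow."""
    return {"n_atoms": len(inputs["symbols"])}


store = OnDemandStore(count_atoms)
first = store.get({"symbols": ["Mo", "S", "S"], "protocol": "standard"})
second = store.get({"protocol": "standard", "symbols": ["Mo", "S", "S"]})
print(first, store.computed)
```

The second request differs only in key order, so it is served from the store and the workflow runs once.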
MATERIALS REPOSITORIES
SERVICES TO THE COMMUNITY AND SCALABILITY OF EFFORTS
SOME CONSIDERATIONS ON HTC
1) The need for massive high-throughput calculations pushes us to develop robust “turnkey” solutions for predicting materials properties
2) Such effort automatically makes it possible to offer core capabilities open to the community at large: fellow computational scientists, experimental groups, national laboratories, companies
MOVING TO THE CLOUD
Computer centres are moving from HPC-only to service providers
Services needed in federated supercomputer centres:
• Database (to store and query information)
  - PostgreSQL 9.5, supporting data-intensive queries, JSON and multiple users
• Object store (to store large files)
  - OpenStack Swift: efficient storage and retrieval of large objects
• Web backends (hosting of web services)
  - Apache: discovery and exploration of existing materials, calculations and workflows, and launch of new ones
• AAI (authentication and authorization infrastructure)
  - In progress: Keystone, Shibboleth; identity management and authentication for federated access
THE MATERIALS CLOUD PLATFORM
SERVICES: AUTOMATED SPECTRA (ELECTRONS, PHONONS)
AN EXAMPLE: MATERIALS DISCOVERY
COMPUTATIONAL EXFOLIATION OF ALL KNOWN INORGANIC MATERIALS
HOW DO WE PRODUCE 2D MATERIALS?
Mechanical (e.g. Geim/Novoselov, fig. from Nature/NUS)…
…or liquid exfoliation (e.g. Nicolosi/Coleman, fig. from Science)
Also, bottom-up: CVD and wet chemical synthesis
HIGH-THROUGHPUT COMPUTATIONAL EXFOLIATION
1. Identification of layered materials among known experimental compounds
2. Automatic calculation of all properties (structure/electronic/magnetic)
3. Binding energies and shear elastic constants → can be exfoliated
4. Testing for mechanical, thermodynamical, electro/chemical stability
HIGH-THROUGHPUT COMPUTATIONAL EXFOLIATION
[Flowchart labels: External databases (ICSD, COD); Layered materials; Exfoliable?; Mechanically stable?; Properties?; Magnetic ground state?]
Building on a previous study of 92 two-dimensional compounds: Lebègue et al., PRX (2013)
A FEW EXAMPLES
LET'S START WITH VOBr2
FROM ICSD TO A WORKING STRUCTURE
Primitive cell & structure sym. refinement
3D RELAXATION
Lowdimfinder on relaxed structure
Pw calculations
PwWorkflow
MAGNETIC SCREENING OF THE 2D MONOLAYER
ChronosWorkflow
REMOVING MECHANICAL INSTABILITIES
Stabilization procedure
Γ-phonon calculation
Displace the atoms along the unstable eigenvectors
Final vc-relax
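A minimal sketch of the displacement step in the procedure above: given phonon modes at Γ, atoms are shifted along the eigenvectors of the unstable modes, after which the structure would be relaxed again. A mode is treated as unstable when its squared frequency is negative (imaginary frequency). The function name, the data layout and the amplitude are assumptions; the phonon calculation and the final vc-relax are outside this sketch.

```python
def displace_along_unstable_modes(positions, modes, amplitude=0.05):
    """Shift atoms along every mode with negative squared frequency.

    positions: list of [x, y, z] per atom.
    modes: list of (omega_squared, eigenvector), where the eigenvector holds
           one [dx, dy, dz] per atom.
    amplitude: displacement scale, a hypothetical value for this sketch.
    Returns the new positions and the number of unstable modes found.
    """
    new_positions = [list(position) for position in positions]
    unstable = [vector for omega_squared, vector in modes if omega_squared < 0.0]
    for vector in unstable:
        for atom, shift in zip(new_positions, vector):
            for axis in range(3):
                atom[axis] += amplitude * shift[axis]
    return new_positions, len(unstable)


# Toy two-atom example: one stable mode and one unstable mode.
positions = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
modes = [
    (4.0, [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]),    # stable, ignored
    (-0.5, [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]),   # unstable, followed
]
displaced, n_unstable = displace_along_unstable_modes(positions, modes, amplitude=0.1)
print(displaced, n_unstable)
```

If no mode is unstable the positions come back unchanged, so the step can be applied unconditionally before the final relaxation.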
FINALLY…
NOVEL EXFOLIABLE MATERIALS
At least 1800 structures are below this line.
Three groups:
• Eb < 30 meV/Å² (DF2-C09) or Eb < 35 meV/Å² (rVV10) → 2D, easily exfoliable
• Eb > 130 meV/Å² → not 2D (discarded)
• In-between → 2D, potentially exfoliable
1053 monolayers
791 monolayers
Mounet et al., arXiv:1611.05234 (2016)
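A minimal sketch of the three-group classification by binding energy. The thresholds are the ones on the slide; treating one functional at a time and applying the 130 meV/Å² cutoff to both functionals are simplifying assumptions of this sketch, and the function name is hypothetical.

```python
# Thresholds in meV/Å², taken from the slide.
EASY_THRESHOLD = {"DF2-C09": 30.0, "rVV10": 35.0}
NOT_2D_THRESHOLD = 130.0  # assumed to apply to both functionals


def classify(binding_energy, functional):
    """Classify a layered material by its binding energy for one functional."""
    if functional not in EASY_THRESHOLD:
        raise ValueError(f"unknown functional: {functional}")
    if binding_energy < EASY_THRESHOLD[functional]:
        return "2D, easily exfoliable"
    if binding_energy > NOT_2D_THRESHOLD:
        return "not 2D (discarded)"
    return "2D, potentially exfoliable"


for energy, functional in [(20.0, "DF2-C09"), (32.0, "DF2-C09"),
                           (32.0, "rVV10"), (150.0, "rVV10")]:
    print(energy, functional, classify(energy, functional))
```

The same binding energy of 32 meV/Å² lands in different groups depending on the functional, which is why the slide quotes a separate threshold for each.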
2D PROTOTYPES
Mounet et al., arXiv:1611.05234 (2016)
OPTIMAL MATERIALS FOR ELECTRONICS
A. Kis (EPFL)
2D TOPOLOGICAL
[Figure: plot with vertical axis Energy [eV], ticks from -0.4 to 0.4; X marked on the horizontal axis]
Z2 topological insulators (9 in ~1000; 4 new, +3 under strain)
AiiDA THANKS
Giovanni Pizzi (EPFL)
Andrius Merkys (Vilnius)
Nicolas Mounet (EPFL)
Boris Kozinsky (BOSCH)
Martin Uhrin (EPFL)
Spyros Zoupanos (EPFL)
Nicola Marzari (EPFL)
Snehal P. Kumbhar (EPFL)
Leonid Kahle (EPFL)
Fernando Gargiulo (EPFL)
Rico Häuselmann (EPFL)
Sebastiaan P. Huber (EPFL)
Andrea Cepellotti (EPFL)
Marco Borelli (EPFL)
Elsa Passaro (EPFL)
Thomas Schulthess (ETHZ, CSCS)
Leopold Talirz (EPFL)
Ole Schütt (EMPA)
FUNDING
Swiss National Centre for Computational Design and Discovery of Novel Materials (http://nccr-marvel.ch)
H2020 Centre of Excellence MaX: Materials Design at the Exascale (http://max-centre.eu)
H2020 Nanoscience Foundries and Fine Analysis (http://nffa.eu)
H2020 European Materials Modelling Council (http://emmc.info)
H2020 Graphene Flagship
H2020 Marketplace
H2020 Marie-Curie Cofund
Max-Planck-EPFL Centre
PASC
Varinor
Constellium
Robert Bosch RTC