GSC2 Maintenance GSC2 Annual meeting 2001


• Database administrative tasks

• Database production tasks

• Identification and correction of errors

• Processing Statistics and where do we go?


Database Administrative tasks

• System and database upgrades

– System upgraded to Windows 2000

– Hard disk storage increased to 4 TB RAID

– Objectivity 6.1: the latest version has browsing and object manipulation improvements and a greatly improved transaction cleanup time.

– New version expected soon: a Python binding for Objectivity will be supported.


DB Admin. (cont)

• Database file migration into the new disk storage

– All 32768 files required re-registration in addition to migration into the new disk storage.

– File access errors occurred during migration. A vendor-provided solution was implemented and all files were registered.

– Reassessed the distribution of files within the RAID system based on previous problems with disk space.

– We had previously implemented an ad hoc fix for the highly non-homogeneous distribution of objects with respect to the HTM.

– Some disks were nearly empty and others nearly full!
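
The file count matches the size of the HTM (Hierarchical Triangular Mesh) at subdivision level 6: the mesh starts from 8 root triangles (N0 to N3, S0 to S3) and every subdivision splits a triangle into 4, giving 8 × 4^6 = 32768 trixels. The Python sketch below works through that arithmetic and the naming scheme. That each file corresponds to one level-6 trixel is an assumption, not something stated on the slides, and the function names are illustrative only.

```python
from itertools import product

# The 8 root triangles of the HTM (4 northern, 4 southern).
ROOTS = ["N0", "N1", "N2", "N3", "S0", "S1", "S2", "S3"]


def trixel_count(level: int) -> int:
    """Number of trixels after `level` subdivisions of the 8 roots."""
    return len(ROOTS) * 4 ** level


def trixel_names(level: int):
    """Yield trixel names at a given level, e.g. N00, N01, ... for level 1."""
    for root in ROOTS:
        for digits in product("0123", repeat=level):
            yield root + "".join(digits)


if __name__ == "__main__":
    print(trixel_count(6))             # number of level-6 trixels
    print(list(trixel_names(1))[:4])   # first few level-1 names
```

Because the trixel name encodes its position on the sky, any prefix of the name identifies a contiguous sky region, which is what makes a prefix-based assignment of files to disks possible.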

[Figure: HTM subdivision of the sky. The root trixels N0 to N3 and S0 to S3 are each divided into four level-1 trixels, N00 to N33 and S00 to S33.]

Drive assignment of the level-1 trixels:

J: N00 N01
K: N02
L: N03
M: N10 N11 N12 N13
N: N20 N21 N22 N23
O: N30 N31
P: N32 N33
Q: S00 S01 S02 S03
R: S10
S: S11
T: S12 S22
U: S20 S21
V: S22
W: S23
X: S30
Y: S31 S32 S33

What this means is that we now have 16 drives, each at about 4% to 8% of capacity: plenty of room to grow, and efficient operations.
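
A drive assignment of this kind can be looked up from the first three characters of a trixel name. The following Python sketch is illustrative only: the table is transcribed as it appears on the slide (including S22 listed under both T and V, and no entry for S13), and the function names are hypothetical. Because of the duplicate entry, the lookup returns a list of candidate drives rather than guessing a single one.

```python
# Drive letter -> level-1 HTM trixels, transcribed from the slide as given.
DRIVE_TABLE = {
    "J": ["N00", "N01"],
    "K": ["N02"],
    "L": ["N03"],
    "M": ["N10", "N11", "N12", "N13"],
    "N": ["N20", "N21", "N22", "N23"],
    "O": ["N30", "N31"],
    "P": ["N32", "N33"],
    "Q": ["S00", "S01", "S02", "S03"],
    "R": ["S10"],
    "S": ["S11"],
    "T": ["S12", "S22"],
    "U": ["S20", "S21"],
    "V": ["S22"],
    "W": ["S23"],
    "X": ["S30"],
    "Y": ["S31", "S32", "S33"],
}


def build_prefix_index(table):
    """Invert the table: level-1 trixel prefix -> list of drive letters."""
    index = {}
    for drive, prefixes in table.items():
        for prefix in prefixes:
            index.setdefault(prefix, []).append(drive)
    return index


def drives_for(trixel_name, index):
    """Return candidate drives for a trixel of any depth (empty if unlisted)."""
    return index.get(trixel_name[:3], [])


if __name__ == "__main__":
    index = build_prefix_index(DRIVE_TABLE)
    print(drives_for("N0123012", index))
    print(drives_for("S2200000", index))
```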


Database production tasks

• All tasks currently in production had to be recompiled and rebuilt.

• Integration of Python scripts into day-to-day production.

• Insertion of reference catalogs into the database.

• Streamlining administrative tasks and integrating them into production tasks.

• Porting photosol and the new classification tasks to the Windows 2000 environment.


Identification and correction of errors

• Complexity

– Tens of thousands of lines of code (C, FORTRAN, Perl, IDL, DCL, C++…)

– 3 operating systems

– Nearly 1 billion objects

– More than 3600 unique photographic plates: response, uniformity of the glass under stress, physics and chemistry of the emulsion and the manufacturing process…

– A great number of factors associated with observational astronomy: seeing, atmospheric transmission, temperature, extinction, telescope tracking, image quality…


Identification and correction of errors

• Database errors

– Corruption

– Referential integrity


Database errors: corruption and referential integrity

• Affected a small amount of data, on the order of fractions of a percent. A complete rebuild of 4 or 5 databases was needed for the GSC2.2 delivery due to corruption.

• Corruption is the most serious problem because we do not know how it occurs.

• It is especially difficult to trace because of our hands-off, large-scale production approach, with complicated production tasks running for weeks on end over large datasets.

• The vendor has requested our help so that they can better understand the problem.

• Referential integrity is fundamental to our project.

• The likely cause is concurrent access between database applications, and possibly between applications and administrative tasks.

• We do have some utilities to check for zero-reference objects and to look statistically at the ratios of objects with 1, 2, …, n references.

• Interpretation is complicated by various factors (primarily the complicated nature of the plate overlap regions). These tools are mainly used after some other indication of a problem.

• The extent of this problem is again fairly small, with only a handful of databases requiring un-matching and re-matching.
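
A check of the kind described above, counting zero-reference objects and tabulating how many objects carry 1, 2, …, n references, could look roughly like the Python sketch below. The in-memory dictionary of object identifiers and reference lists is a hypothetical stand-in for the database; it is not the project's actual utility or the Objectivity API.

```python
from collections import Counter


def zero_reference_objects(objects):
    """Return the ids of objects that have no references at all."""
    return [oid for oid, refs in objects.items() if len(refs) == 0]


def reference_ratios(objects):
    """Fraction of objects having exactly n references, for each observed n."""
    counts = Counter(len(refs) for refs in objects.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {n: counts[n] / total for n in sorted(counts)}


if __name__ == "__main__":
    # Made-up example: object id -> list of plate observations it references.
    sample = {
        "obj1": ["plateA"],
        "obj2": ["plateA", "plateB"],
        "obj3": [],
        "obj4": ["plateB"],
    }
    print(zero_reference_objects(sample))
    print(reference_ratios(sample))
```

Comparing these ratios between neighbouring regions only hints at a problem; as noted above, plate overlaps change the expected number of references from place to place.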


Database errors

• Matching integrity

• Clearly visible on the sky maps.

• Can be the result of various causes.

• Where it occurs only in the overlap regions, it could be the result of database timing and file access problems as well as astrometric problems (although the astrometry tasks are very robust and the reduction has been very uniform for the 2nd epoch surveys).

• The identification and correction of matching problems is made more difficult by the difference between the plate-based matching and the region-based database.


[Figure: sky map with axis labels 15 h, 12 h, +20 and +30, showing fields 507, 508, 442, 443, N321 and N322.]

In addition to the J and F POSS-II fields 507, 508, 442 and 443, we have the IV N fields 507 and 508, which do not have magnitude limits imposed in the export task. The quick V fields N321 and N322 are loaded as well. All the matching is reasonably well done in the plate centers.



Take a closer look at the north-east corner of field 507

Bright stars:

1. Entry with F, J, N, V

2. 2nd entry with F, N

3. 3rd entry with V

As well as various entries that result from the lack of a magnitude limit for V and N.
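
Split entries like these, where one star appears as several catalog entries carrying different passbands, can be searched for by looking for entries that lie very close together on the sky. The Python sketch below is an illustration under stated assumptions: the entry layout (`id`, `ra`, `dec`, `bands`), the 3 arcsecond tolerance and the sample coordinates are all made up, and the brute-force pair loop is only suitable for small regions.

```python
import math
from itertools import combinations


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def split_entry_pairs(entries, tol_arcsec=3.0):
    """Return id pairs of distinct entries closer than the tolerance."""
    pairs = []
    for a, b in combinations(entries, 2):
        sep = angular_separation_arcsec(a["ra"], a["dec"], b["ra"], b["dec"])
        if sep <= tol_arcsec:
            pairs.append((a["id"], b["id"]))
    return pairs


if __name__ == "__main__":
    # Made-up entries mimicking one bright star split into three entries.
    entries = [
        {"id": 1, "ra": 200.0000, "dec": 29.0000, "bands": "FJNV"},
        {"id": 2, "ra": 200.0002, "dec": 29.0001, "bands": "FN"},
        {"id": 3, "ra": 200.0001, "dec": 28.9999, "bands": "V"},
        {"id": 4, "ra": 200.5000, "dec": 29.0000, "bands": "FJ"},
    ]
    print(split_entry_pairs(entries))
```

A close pair is only a candidate: genuine neighbouring stars will also be reported, so the passband lists of each pair still have to be inspected.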


So? Whatever happened to create this must have been a fairly complicated sequence of events.

• Clearly for some reason all the plates matched well at the center of the field but not around the edges. Evidence suggests that this is NOT an astrometric problem.

• Completely un-match all the plates in the region.

• Verify the astrometric and photometric solutions for the IV N and quick V plates (I am assuming that there is a reason to include these fields, despite the fact that they really do not belong in this release).

• Re-match all the regions on all the plates again.

• Re-export all those regions and check.


What is the point?

• No single method or task was able to detect and explain why this region looked so different.

• Visualization tools (skycat showsky, fitsview and IDL) were all used, but care must be exercised because different problems (such as photometric errors) could produce similar results.

• Global statistics such as matching ratios and object index counts are easy to generate, but they gave only very qualitative indications that are hard to interpret.

• The cause of the problem remains unknown.

• On the order of 50 fields or 100 plates may be affected in a similar fashion, requiring re-matching. The difficulty is in the identification, due to the complicated plate overlaps.

• The data is completely fixable! And this allows us to focus on the important issues (science, calibration updates, loading and matching new surveys…).


Photometric Errors

• Fairly easy to identify and fix if they are large.

• From the plate maps it is clear that the plate-to-plate consistency is fairly good.

• Several cases were found in fields without good sequences deeper than GSPC1 (14th to 15th mag).


Field 219 north

Examination of J-F showed decent agreement down to around 14th mag. It diverged at fainter magnitudes, with the mean J-F near 18th mag at about 2.5 to 3.0. It also turned out that the cause was an isolated procedural error: an acceptable calibration had been performed but had not been applied properly.
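
A J-F check of this kind can be sketched as a mean color per magnitude bin. In the Python sketch below, binning on the J magnitude, the 1 mag bin width, the threshold of 2.0 and the sample magnitudes are all assumptions made for illustration; the slides do not say how the comparison was actually carried out.

```python
import math
from collections import defaultdict
from statistics import mean


def mean_color_by_magnitude(stars, bin_width=1.0):
    """Mean J-F per J-magnitude bin; `stars` is a list of (j_mag, f_mag)."""
    bins = defaultdict(list)
    for j_mag, f_mag in stars:
        # Label each bin by its lower edge, e.g. 18.2 falls in the 18.0 bin.
        key = math.floor(j_mag / bin_width) * bin_width
        bins[key].append(j_mag - f_mag)
    return {key: mean(colors) for key, colors in sorted(bins.items())}


def suspicious_bins(binned, max_mean_color=2.0):
    """Bins whose mean J-F exceeds an (assumed) plausibility threshold."""
    return [key for key, color in binned.items() if color > max_mean_color]


if __name__ == "__main__":
    # Made-up magnitudes: good agreement near 14, divergence near 18.
    stars = [(14.0, 13.2), (14.5, 13.6), (18.2, 15.4), (18.6, 15.9)]
    binned = mean_color_by_magnitude(stars)
    print(binned)
    print(suspicious_bins(binned))
```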


Catalog processing

• Southern IVN (IS) 44% complete

• North IVN (XI) 35% complete

• POSS1 E 100% complete

• Small number have been loaded and matched.

• Infrared surveys could be completed in relatively short time if concerted effort was made.

• May have to make some hard decisions in light of staffing and resource issues.


Conclusions

1. Database maintenance continues to be a high-priority issue prior to proceeding with large-scale operations such as loading and matching new surveys.

2. Data quality and integrity are being addressed, but fixes may not be incorporated into the export catalog on a fix-by-fix basis. High-priority requests could be accommodated to some degree.

3. Estimates of the amount of data that has been compromised (primarily in the matching integrity) are at less than 10%.

4. We can identify and fix the vast majority of problems.

5. Data processing and future enhancements to the GSC2 are both planned and proceeding.

6. We are grateful and deeply indebted to those working in collaboration with us to help in the analysis and better understand this massive dataset.