
Contentscfm.zjuyh.com/.../20190529/20190529141709_8019.pdf · Advances in Statistical Learning Real-world Application of AI & Service Computing Time series analysis Advanced Methods


Contents

Instruction
Program

Abstracts

On Gradient-Based Optimization: Accelerated, Stochastic and Nonconvex (Michael I. Jordan)
Value of Information Methods (Le Bao)
TBA (Jian Cao)
Optimal covariance matrix estimation for high-dimensional noise in high-frequency data (Jinyuan Chang)
Multivariate network meta-analysis made simple (Yong Chen)
Functional Canonical Correlation and Functional Prediction (Di-Rong Chen)
A Power One Test for Unit Roots Based on Sample Autocovariances (Guanghui Cheng)
METRIC LEARNING VIA CROSS-VALIDATION (Linlin Dai)
Dynamic Change Detection with False Discovery Rate Control (Lilun Du)
Valuing commodity options and futures options with changing economic conditions (Kun Fan)
An extended Mallows model for ranked data aggregation (Xiaodan Fan)
Some challenges in analyzing big data in health and medical research (Bo Fu)
Estimating Truncated Functional Linear Models with a Nested Group Bridge Approach (Tianyu Guan)
Modeling Traffic Crash Risk (Feng Guo)
Moderate-Dimensional Inferences on Quadratic Functionals in Ordinary Least Squares (Xiao Guo)
Local Inference in Additive Models with Decorrelated Local Linear Estimator (Zijian Guo)
Nonlocal online RPCA for video denoising (Zhi Han)
Oracle P-value and Variable Screening (Ning Hao)
Inference in a mixture additive hazards cure model (Haijin He)
The Pearson Correlation Between Tree-Shaped Data Sets: Estimating, Graphical Representation and Hypothesis Testing (Jie Hu)
AI-Based Solution for Financial Risk Assessment and Fraud Detection (Ling Huang)
Causal mediation of semicompeting risks (Yen-Tsung Huang)
Multiple Imputation on enhanced model identification for nonignorable nonresponse (Jongho Im)
Generalized Four Moment Theorem and an Application to CLT for Spiked Eigenvalues of Large-dimensional Covariance Matrices (Dandan Jiang)
Functional-coefficient regression models with GARCH errors (Jiancheng Jiang)


Prediction of hospital readmission frailties with misspecified shared frailty models (Xuejun Jiang)
The Operating Principle of Regularized Spectral Clustering (Donggyu Kim)
Discrepancy between global and local principal component analysis on large-panel high-frequency data (Xin-Bing Kong)
Optimal Estimation of Wasserstein Distance on Trees with An Application to Microbiome Studies (Hongzhe Li)
A supplement to Jiang's asymptotic distribution of the largest entry of a sample correlation matrix (Deli Li)
High-Dimensional Vector Autoregressive Time Series Modeling via Tensor Decomposition (Guodong Li)
Tensor Analysis and Neuroimaging Applications (Lexin Li)
Statistical Learning for Personalized Wealth Management (Yingying Li)
Mediation analysis for zero-inflated mediators (Zhigang Li)
Identifiability and Non-Convex Algorithm for Multi-Channel Blind Deconvolution (Song Li)
A non-randomized multiple testing procedure for large-scale heterogeneous discrete hypotheses based on randomized tests (Nan Lin)
Deep Neural Networks for Rotation-Invariance Approximation and Learning (Shao-Bo Lin)
A Quantile Association-based Variable Selection (Yuanyuan Lin)
Some Statistical Methods for Single-cell Genomics (Zhixiang Lin)
Weighted multiple-quantile classifiers for functional data with application in multiple sclerosis screening (Catherine Liu)
Optimal Covariance Matrix Estimation for High-dimensional Noise in High-frequency Data (Cheng Liu)
Data-adaptive Kernel Support Vector Machine (Xin Liu)
Testing of covariate effects under ridge regression for high-dimensional data (Xu Liu)
Towards Software-Defined Infrastructure for Decentralized Data Governance (Xuanzhe Liu)
Distributed learning from multiple EHR databases: Contextual embedding models for predicting medical events (Qi Long)
Wavelet Empirical Likelihood Estimator for Stationary and Locally Stationary Long Memory Processes (Zhiping Lu)
GMV Prediction Using Driver Preference (Shikai Luo)
A Nonparametric Bayesian Approach to Simultaneous Subject and Cell Heterogeneity Discovery for Single Cell RNA-Seq Data (Xiangyu Luo)
A Versatile Estimation Procedure without Estimating the Nonignorable Missingness Mechanism (Yanyuan Ma)
Matrix Completion under Low-Rank Missing Mechanism (Xiaojun Mao)
A mean field theory of two-layers neural networks (Song Mei)
A Dynamic Additive and Multiplicative Effects Model with Application to the United Nations Voting Behaviours (Xiaoyue Niu)


A Super Scalable Algorithm for Short Segment Detection (Yue Niu)
Improved doubly robust estimation in learning individualized treatment rules (Yinghao Pan)
Predicting terrorist events: opportunities and challenges (Andre Python)
On the ‘Off-Label’ Use of Data Normalization for Sample Classification and Prognostication (Li-Xuan Qin)
Adaptive Minimax Density Estimation for Huber’s Contamination Model under $L_p$ Losses (Zhao Ren)
Dynamic Spatial Panel Data Models with Endogeneity and Common Factors (Wei Shi)
Bridging the gap between noisy healthcare data and knowledge: automated translation of medical terminology (Xu Shi)
Estimating the sample mean and standard deviation from the five-number summary and their applications in evidence-based medicine (Tiejun Tong)
An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss (Cheng Wang)
MOSUM-based test and estimation method for multiple changes in panel data (Man Wang)
Structured tensor decomposition and its application (Miaoyan Wang)
Identification of the number of factors for factor modeling in high dimensional time series (Qinwen Wang)
Large Multiple Graphical Model Inference via Bootstrap (Shaoli Wang)
An adaptive independence test for microbiome community data (Tao Wang)
Integrated Quantile Rank Test (iQRAT) for heterogeneous joint effect of rare and common variants in sequencing studies (Tianying Wang)
Model Free Approach to Quantifying the Proportion of Treatment Effect Explained by a Surrogate Marker (Xuan Wang)
Scattering Transform and Stylometry Analysis in Arts (Yang Wang)
A Fast and Practical Randomized Method for Low-Rank Tensor Approximations (Yao Wang)
The traps that must be encountered in machine learning practices (Hu Wei)
Flexible Experimental Designs for Valid Single-cell RNA-sequencing Experiments Allowing Batch Effects Correction (Yingying Wei)
Recent developments in graph matching: statistical analysis (Yihong Wu)
Differential Markov Random Field Analysis (Yin Xia)
Building a Translational Research Program in Neuroinflammation: A Data Driven Approach to Advance Precision Medicine for Multiple Sclerosis (Zongqi Xia)
Realized volatility forecasting with HAR-GARCH type model: a Bayesian approach (Han Xiang)
TBA (Han Xiao)
Multiple Testing Embedded in an Aggregation Tree to Identify where Two Distributions Differ (Jichun Xie)
Pearson's statistics: approximation theory and beyond (Mengyu Xu)
Distribution and correlation free two-sample test of high-dimensional means (Kaijie Xue)
Robust Estimates and Generative Adversarial Networks (Yuan Yao)


A statistical and machine learning framework for new energy vehicle ride sharing system (Kaixian Yu)
Word Segmentation and Term Discovery in Chinese Electronic Medical Records Using Graph Theory and Deep Learning (Sheng Yu)
Statistical Inference Based on Sufficient Dimension Reduction (Zhou Yu)
Global Optimality of Stochastic Semi-definite Optimization with Application to Ordinal Embedding (Jinshan Zeng)
High-dimensional Tensor Regression Analysis (Anru Zhang)
Enhanced Pulmonary Nodule Detection Using Fully Automated Deep Learning: A Multifactor Investigation (Chi Zhang)
Heteroscedasticity test based on high-frequency data with jumps and microstructure noise (Chuanhai Zhang)
Factor Models for High-Dimensional Tensor Time Series (Cun-Hui Zhang)
Stochastic differential reinsurance games with capital injections (Nan Zhang)
Structured sparse logistic regression with application to lung cancer prediction using breath volatile biomarkers (Xiaochen Zhang)
Simulated Distribution Based Learning for Non-regular and Regular Statistical Inferences (Zhengjun Zhang)
Estimation and inference for the indirect effect in high-dimensional linear mediation models (Sihai Dave Zhao)
Factor Modeling for Volatility (Xinghua Zheng)
Sequential scaled sparse factor regression (Zemin Zheng)
Estimating Endogenous Treatment Effect Using High-Dimensional Instruments with an Application to the Olympic Effect (Wei Zhong)
Approximation Theory of Deep Convolutional Neural Networks (Ding-Xuan Zhou)
Global Convergence of EM (Harrison H. Zhou)
R package for new normality test (Maoyuan Zhou)
GD-RDA: A new regularized discriminant analysis for high dimensional data (Yan Zhou)
Matrix Completion for Network Analysis (Ji Zhu)
Quantile double autoregression (Qianqian Zhu)
A Boosting Algorithm for Estimating Generalized Propensity Scores with Continuous Treatments (Yeying Zhu)
Safe machine learning for safe genome editing (James Zou)

List of Participants
List of Participants (ZJU)
Maps


2019 杭州数据科学前沿国际研讨会

2019 Hangzhou International Conference on Frontiers of Data Science

May 26-27, 2019

Jinxi Hotel, Hangzhou

杭州金溪山庄

Programming Committee
Tony CAI, University of Pennsylvania (USA)
Tianxi CAI, Harvard University (USA)
Qi-Man SHAO, Southern University of Science and Technology (China)
Yazhen WANG, University of Wisconsin-Madison (USA, Chair)
Ming YUAN, Columbia University (USA)
Heping ZHANG, Yale University (USA)

Local Organizing Committee
Xi CHEN, Zhejiang University (Chair)
Zhonggen SU, Zhejiang University
Jianwei YIN, Zhejiang University
Lixin ZHANG, Zhejiang University
Rongmao ZHANG, Zhejiang University

Contact: Weina Su, [email protected], 0571-88208268

Organizer: Center for Data Science, Zhejiang University

Co-sponsor: Zhejiang Association of Internet Finance; International Business School of Zhejiang University; International Research Center for Data Analytics and Management, Zhejiang University

Official Website: http://cds.zju.edu.cn


Program

May 26

08:30~09:10 Opening Ceremony
Jinxi Hall, Chair: Xi Chen

09:10~10:00 Keynote Speech: Michael I. Jordan
Title: On Gradient-Based Optimization: Accelerated, Stochastic and Nonconvex
Jinxi Hall, Chair: Tony Cai

10:00~10:20 Tea Break

10:20~12:00

Recent advance in complex data analysis
No.1 Hall, Chair: Zijian Guo, Organizer: Zijian Guo
Speakers: Cun-Hui Zhang, Harrison Zhou, Han Xiao, Yihong Wu

Learning Theory and Compressed Sensing
No.2 Hall, Chair: Song Li, Organizer: Junhong Lin
Speakers: Yang Wang, Ding-xuan Zhou, Di-rong Chen, Song Li

The advancement in statistical methods for biological data analysis
No.3 Hall, Chair: Jie Hu, Organizer: Hangjin Jiang
Speakers: Xiaodan Fan, Yingying Wei, Xiangyu Luo, Jie Hu

Data Science in Healthcare
Sightseeing Hall, Chair: Wei Luo, Organizer: Tianxi Cai
Speakers: Xu Shi, Zongqi Xia, Sheng Yu, Xiaoyue Niu

Lunch

13:30~15:10

Translational Biomedical Data Science
No.1 Hall, Chair: Xuan Wang, Organizer: Tianxi Cai
Speakers: Xuan Wang, Le Bao, Yanyuan Ma, Yen-Tsung Huang

Recent development on data analysis
No.2 Hall, Chair: Donggyu Kim, Organizer: Donggyu Kim
Speakers: Donggyu Kim, Hyo Young Choi, Jongho Im, Maoyuan Zhou

Recent Developments in Applied Statistical methods
No.3 Hall, Chair: Zhixiang Lin, Organizer: Yuanyuan Lin
Speakers: Xu Liu, Zhixiang Lin, Linlin Dai, Yuanyuan Lin

Recent development in complex data analysis
Sightseeing Hall, Chair: Jinyuan Chang, Organizer: Jinyuan Chang
Speakers: Zemin Zheng, Guanghui Cheng, Jinyuan Chang

15:10~15:30 Tea Break

15:30~17:10

Advances in Statistical Learning
No.1 Hall, Chair: Yin Xia, Organizer: Tianxi Cai
Speakers: Dave Zhao, Yin Xia, Zijian Guo, Ji Zhu

Real-world Application of AI & Service Computing
No.2 Hall, Chair: Renjun Xu, Organizer: Renjun Xu & Xiaoye Miao
Speakers: Xuanzhe Liu, Ling Huang, Hu Wei, Andre Python

Time series analysis
No.3 Hall, Chair: Rongmao Zhang, Organizer: Xinyu Song & Yazhen Wang
Speakers: Qianqian Zhu, Chuanhai Zhang, Xiao Guo, Zhiping Lu

Advanced Methods for Analysis of Modern Biomedical Data
Sightseeing Hall, Chair: Qi Long, Organizer: Qi Long
Speakers: Yinghao Pan, Yong Chen, Xiaochen Zhang, Qi Long


May 27

08:30~10:10

Advances in High Dimensional Models: Beyond Scalar-on-vector Regressions
No.1 Hall, Chair: Junhong Lin, Organizer: Emma Jingfei Zhang
Speakers: Lexin Li, Jiguo Cao, Tao Wang

Complex model inference
No.2 Hall, Chair: Tianxiao Pang, Organizer: Yazhen Wang
Speakers: Zhengjun Zhang, Nan Lin, Bo Fu, Anru Zhang

Machine Learning in Integration of Large-Scale Genomics Data
No.3 Hall, Chair: Hongzhe Li, Organizer: Hongzhe Li
Speakers: Jichun Xie, James Zou, Hongzhe Li

Machine Learning Theory and Application
Sightseeing Hall, Chair: Harrison Zhou, Organizer: Harrison Zhou
Speakers: Song Mei, Zhao Ren, Yuan Yao

10:10~10:30 Tea Break

10:30~12:10

Financial Econometrics
No.1 Hall, Chair: Hangjin Jiang, Organizer: Yazhen Wang
Speakers: Yingying Li, Xinghua Zheng, Xinbing Kong, Qinwen Wang

Harness the Power of Complicated data
No.2 Hall, Chair: Ning Hao, Organizer: Jiancheng Jiang
Speakers: Yue Niu, Ning Hao, Jiancheng Jiang, Dandan Jiang

New methods and perspective for analysis of multiple data sources
No.3 Hall, Chair: Catherine Liu, Organizer: Catherine Liu
Speakers: Lilun Du, Guodong Li, Catherine Liu, Tiejun Tong

Data science in genetics
Sightseeing Hall, Chair: Wei Luo, Organizer: Ying Wei
Speakers: Lixuan Qin, Tianying Wang, Zhigang Li, Xin Liu

Lunch

13:30~15:10

Modern statistics: theory and methods
No.1 Hall, Chair: Weidong Liu, Organizer: Weidong Liu
Speakers: Xiaojun Mao, Zhou Yu, Wei Zhong

Learning high-dimensional data with complex structure
No.2 Hall, Chair: Miaoyan Wang, Organizer: Miaoyan Wang
Speakers: Chi Zhang, Wei Shi, Mengyu Xu, Miaoyan Wang

Machine learning
No.3 Hall, Chair: Yao Wang, Organizer: Yao Wang
Speakers: Shaobo Lin, Jinshan Zeng, Zhi Han, Yao Wang

Financial Stochastics and Statistics
Sightseeing Hall, Chair: Zhonggen Su, Organizer: Yazhen Wang & Changliang Zou
Speakers: Kun Fan, Nan Zhang, Cheng Liu

15:10~15:30 Tea Break

15:30~17:10

Recent Advances in Complex Data Analysis
No.1 Hall, Chair: Man Wang, Organizer: Xinyuan Song
Speakers: Haijin He, Xuejun Jiang, Han Xiang

Statistical and Machine Learning Methods with Application in AI transportation
No.2 Hall, Chair: Renjun Xu, Organizer: Rui Song
Speakers: Shikai Luo, Kaixian Yu, Feng Guo

Recent developments in statistical and probability learnings
No.3 Hall, Chair: Wei Luo, Organizer: Wei Luo
Speakers: Shaoli Wang, Yeying Zhu, Deli Li

Service Computing & Big Data Analysis
Sightseeing Hall, Chair: Xiaoye Miao, Organizer: Xiaoye Miao
Speakers: Jian Cao, Kaijie Xue


On Gradient-Based Optimization: Accelerated, Stochastic and Nonconvex

Michael I. Jordan1

1 University of California at Berkeley, USA

E-mail: [email protected]

Abstract: Optimization methods play a key enabling role in statistical inference, both frequentist and Bayesian. Moreover, as statistics begins to more fully embrace computation, what is often meant by "computation" is in fact "optimization". I will discuss some recent progress in high-dimensional, large-scale optimization, where new theory and algorithms have provided non-asymptotic rates, sharp dimension dependence, elegant ties to geometry and practical relevance. In particular, I discuss several recent results: (1) a new framework for understanding Nesterov acceleration, obtained by taking a continuous-time, Lagrangian/Hamiltonian/symplectic perspective, (2) a discussion of how to escape saddle points efficiently in nonconvex optimization, and (3) the acceleration of Langevin diffusion.
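
For readers unfamiliar with item (1), the following is a minimal sketch of Nesterov's accelerated gradient method in its standard discrete-time form, compared with plain gradient descent on a small ill-conditioned quadratic. It is not the continuous-time framework discussed in the talk; the test function, the step size, the momentum schedule (k - 1)/(k + 2), and all function names are assumptions chosen for illustration.

```python
def grad(x, scales):
    """Gradient of the toy objective f(x) = 0.5 * sum(s_i * x_i**2)."""
    return [s * xi for s, xi in zip(scales, x)]


def objective(x, scales):
    """Toy quadratic objective (an assumption, not taken from the talk)."""
    return 0.5 * sum(s * xi * xi for s, xi in zip(scales, x))


def gradient_descent(x0, scales, step, iters):
    """Plain gradient descent with a fixed step size."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x, scales)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def nesterov(x0, scales, step, iters):
    """Nesterov's accelerated gradient with momentum (k - 1) / (k + 2)."""
    x = list(x0)  # current iterate x_{k-1}
    y = list(x0)  # extrapolated point at which the gradient is evaluated
    for k in range(1, iters + 1):
        g = grad(y, scales)
        x_next = [yi - step * gi for yi, gi in zip(y, g)]
        momentum = (k - 1) / (k + 2)
        y = [xn + momentum * (xn - xi) for xn, xi in zip(x_next, x)]
        x = x_next
    return x


# Ill-conditioned quadratic: curvatures 1 and 0.001, so L = 1 and step = 1 / L.
scales = [1.0, 0.001]
x0 = [1.0, 1.0]
step = 1.0
f_gd = objective(gradient_descent(x0, scales, step, 200), scales)
f_nag = objective(nesterov(x0, scales, step, 200), scales)
print(f_gd, f_nag)
```

On this toy problem the accelerated iterate reaches a smaller objective value than gradient descent after the same number of steps, which is the behaviour the O(1/k^2) versus O(1/k) rates for smooth convex functions lead one to expect.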


Value of Information Methods

Jacob Parsons, Le Bao*

*Department of Statistics, The Pennsylvania State University, USA

E-mail: [email protected]

Abstract: We develop new value of information methods to apply to the problems of outlier detection and influence analysis. The proposed method has a distinct advantage in flexibility and interpretability when compared to existing methods. We study the theoretical properties of three value of information quantities, establish the relationship between the proposed measures and classic measures, and illustrate our proposed approach in the case of a generalized linear mixed model for studying HIV epidemics.

Key Words: Influence; Outlier; Bayesian


TBA

Jian Cao1

1 Shanghai Jiaotong University, China

E-mail: [email protected]

Abstract: TBA


Optimal covariance matrix estimation for high-dimensional noise in high-frequency data

Jinyuan Chang1

1 Southwestern University of Finance and Economics, China

E-mail: [email protected]

Abstract: TBA


Multivariate network meta-analysis made simple

Yong Chen1

1 University of Pennsylvania, 423 Guardian Dr, Blockley 602, USA

E-mail: [email protected]

Abstract: Due to patients' heterogeneous responses to treatment, there is growing interest in developing novel and efficient statistical methods for estimating individualized treatment rules (ITRs). The central idea is to recommend treatment according to patient characteristics, and the optimal ITR is the one that maximizes the expected clinical outcome if followed by the patient population. We propose an improved estimator of the optimal ITR that enjoys two key properties. First, it is doubly robust, meaning that the proposed estimator is consistent if either the propensity score or the outcome model is correct. Second, it achieves the smallest variance among its class of doubly robust estimators when the propensity score model is correctly specified, regardless of the specification of the outcome model. Simulation studies show that the estimated optimal ITR obtained from our method yields better clinical outcomes than its main competitors. Data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study are analyzed as an illustrative example.

Key Words: Double robustness; Individualized treatment rule; Personalized medicine


Functional Canonical Correlation and Functional Prediction

Di-Rong Chen1

1 School of Mathematics and System Sciences, Beihang University, China

E-mail: [email protected]

Abstract: Functional data analysis (FDA) is concerned with infinite-dimensional objects collected in the form of random curves. A reproducing kernel Hilbert space (RKHS) framework, based on the Loève–Parzen congruence, has been developed in FDA to deal with the non-invertibility of some compact operators, e.g., the covariance operator. In this talk, we consider functional canonical correlation analysis and functional prediction in the RKHS framework. Some estimators are proposed and their convergence rates are established.


A Power One Test for Unit Roots Based on Sample Autocovariances

Guanghui Cheng1, Jinyuan Chang, Qiwei Yao

1 Guangzhou University, China

E-mail: [email protected]

Abstract: Testing for the existence of unit roots is a long-standing and challenging problem in time series analysis in econometrics. Most available methods suffer from a lack of size control and poor power. Perhaps more significantly, the settings for unit root tests typically postulate the existence of unit roots as the null hypothesis and therefore lead to innately indecisive inference, as statistical tests are incapable of accepting a null hypothesis. We propose a new test for the null hypothesis that the observed time series is weakly stationary (or with a linear deterministic trend) against the alternative that unit roots are present. Therefore, when the null hypothesis is rejected, we conclude that unit roots are indicated by significant evidence in the data.


METRIC LEARNING VIA CROSS-VALIDATION

Linlin Dai1, Kani Chen, Gang Li, Yuanyuan Lin

1Southwestern University of Finance and Economics

E-mail: [email protected]

Abstract: Many algorithms rely on a good measure of the distance between two inputs. Metric learning learns a proper metric from the data, and the learned metric can then be used to derive algorithms for problem solving. It is widely applied in machine learning, data mining and pattern recognition. This paper studies the multiple index model, which has attracted much attention in statistics, in the framework of metric learning. The metric is incorporated in a nonparametric kernel smoothing function to approximate the link function. An optimization procedure over all positive semi-definite matrices is built on cross-validation and is called cross-validation metric learning. This procedure simultaneously estimates the link function and finds the projection matrix of the multiple index model. The estimated link function and directions share the same bandwidth, which avoids the problems caused by the two separate estimation procedures used in some other approaches. Various simulation studies and real data analyses demonstrate its advantages over other existing approaches.

Key Words: cross-validation, multiple index model, kernel estimation
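
As a rough illustration of the idea in the abstract above (not the authors' procedure), the sketch below evaluates a leave-one-out cross-validation loss for a kernel smoother whose distance is the metric M = B Bᵀ. The Gaussian kernel, the names `loo_cv_loss` and `B`, and the toy single-index data are assumptions made only for this example.

```python
import numpy as np

def loo_cv_loss(B, X, y):
    """Leave-one-out CV loss of a kernel smoother under the metric M = B @ B.T."""
    Z = X @ B                                  # squared distance is (x - x')^T M (x - x')
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-0.5 * d2)                      # Gaussian kernel; the bandwidth is absorbed into B
    np.fill_diagonal(K, 0.0)                   # leave each point out of its own fit
    fitted = (K @ y) / np.maximum(K.sum(axis=1), 1e-12)
    return float(np.mean((y - fitted) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # response depends on the first coordinate only

B_informative = np.array([[3.0], [0.0], [0.0]])    # metric concentrated on the informative direction
B_noise = np.array([[0.0], [3.0], [0.0]])          # metric concentrated on a pure-noise direction
loss_informative = loo_cv_loss(B_informative, X, y)
loss_noise = loo_cv_loss(B_noise, X, y)
```

Minimizing such a loss over B with a generic optimizer would select the projection and the smoothing level jointly, which is the spirit of the single-bandwidth idea described in the abstract.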


Dynamic Change Detection with False Discovery Rate Control

Lilun Du1, Changliang Zou2

1HKUST, HKSAR

2Nankai University, China

E-mail: [email protected]

Abstract: In multiple data stream surveillance, the rapid and sequential identification of individuals whose behaviour deviates from the norm has become particularly important. In such applications, the state of a stream can alternate, possibly multiple times, between a null state and an alternative state. To balance the ability to detect two types of changes, that is, a change from the null to the alternative and back to the null, we develop a large-scale dynamic testing system in the framework of false discovery rate (FDR) control. By fully exploiting the sequential feature of data streams, we propose a new procedure based on a penalised version of the generalised likelihood ratio test statistics for change detection. The FDR at each time point is shown to be controlled under some mild conditions on the dependence structure of data streams. A data-driven approach is developed for selection of the penalisation parameter, which gives the new method an edge over existing methods in terms of FDR control and detection delay. Its advantage is demonstrated via simulation and a data example. Key Words: Epidemic changes; High-dimensional data streams; Multiple change-points; Multiple testing; Penalised methods; Sequential detection


Valuing commodity options and futures options with changing economic conditions

Kun Fan1, Yang Shen2,*, Tak Kuen Siu3, Rongming Wang1

1 School of Statistics, East China Normal University, Shanghai 200241, China

2 Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada, M3J 1P3

3 Department of Applied Finance Actuarial Studies, Faculty of Business and Economics, Macquarie University, Sydney, NSW 2109, Australia

E-mail: [email protected]

Abstract: A model for valuing a European-style commodity option and a futures option is discussed with a view to incorporating the impact of changing hidden economic conditions on commodity price dynamics. The proposed model may be thought of as an extension of the Gibson–Schwartz two-factor model, where the model parameters vary when the hidden state of an economy switches. A semi-analytical approach to valuing commodity options and futures options is adopted, where closed-form expressions for the characteristic functions of the logarithmic commodity price and futures price are derived. A fast Fourier transform (FFT) approach is then applied to provide a practical and efficient way to evaluate the option prices. Real data studies and numerical examples are used to illustrate the practical implementation of the model.

Key Words: Commodity options; Futures options; Regime-switching; Gibson–Schwartz two-factor model; Fast Fourier transform
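
To make the FFT step concrete, the following sketch prices European calls from a characteristic function using the well-known Carr-Madan damping approach. It is only an illustration: the Black-Scholes characteristic function is used as a stand-in because the regime-switching characteristic function of the abstract is not given here, and the function names and grid parameters (`alpha`, `n`, `eta`) are assumptions.

```python
import numpy as np
from math import erf, exp, log, pi, sqrt

def carr_madan_call(char_fn, r, T, alpha=1.5, n=4096, eta=0.25):
    """Return (log_strikes, call_prices) from the characteristic function of log S_T."""
    lam = 2 * pi / (n * eta)                   # spacing of the log-strike grid
    b = n * lam / 2                            # log-strikes cover [-b, b)
    v = eta * np.arange(n)                     # integration grid
    k = -b + lam * np.arange(n)                # log-strike grid
    psi = np.exp(-r * T) * char_fn(v - (alpha + 1) * 1j) / (
        alpha**2 + alpha - v**2 + 1j * (2 * alpha + 1) * v)
    w = (eta / 3) * (3 + (-1.0) ** (np.arange(n) + 1))   # Simpson weights
    w[0] = eta / 3
    x = np.exp(1j * b * v) * psi * w
    prices = np.exp(-alpha * k) / pi * np.fft.fft(x).real
    return k, prices

def bs_char_fn(u, s0, r, sigma, T):
    """Stand-in model: characteristic function of log S_T under Black-Scholes."""
    mu = log(s0) + (r - 0.5 * sigma**2) * T
    return np.exp(1j * u * mu - 0.5 * sigma**2 * T * u**2)

def bs_call(s0, K, r, sigma, T):
    """Closed-form Black-Scholes call, used only as a reference value."""
    d1 = (log(s0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    cdf = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
    return s0 * cdf(d1) - K * exp(-r * T) * cdf(d2)

s0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
k, prices = carr_madan_call(lambda u: bs_char_fn(u, s0, r, sigma, T), r, T)
fft_price = float(np.interp(log(K), k, prices))   # interpolate on the log-strike grid
exact_price = bs_call(s0, K, r, sigma, T)
```

One FFT call returns prices on a whole grid of strikes, which is why a closed-form characteristic function makes this route efficient; swapping in another model only requires replacing `bs_char_fn`.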


An extended Mallows model for ranked data aggregation

Han Li1,2, Minxuan Xu2,3, Jun S. Liu4, Xiaodan Fan2,*

1College of Economics, Shenzhen University, Shenzhen, China

2Department of Statistics, The Chinese University of Hong Kong, Shatin, HK SAR, China

3Department of Statistics, University of California, Los Angeles, CA, USA

4Department of Statistics, Harvard University, Cambridge, MA, USA

E-mail: [email protected]

Abstract: For the same disease, different studies may report differently ranked gene lists after measuring the association of genes with the disease. We aim to find a consensus ranking by aggregating multiple ranking lists. To address the problem probabilistically, we formulate an elaborate ranking model for full and partial rankings by generalizing the Mallows model. Our model assumes that the ranked data are generated through a multistage ranking process that is explicitly governed by parameters that measure the overall quality and stability of the process. The new model is quite flexible and has a closed-form expression. Under mild conditions, we derive several useful theoretical properties of the model. Furthermore, we propose an efficient statistic called the rank coefficient to detect over-correlated rankings and a hierarchical ranking model to fit the data. Through extensive simulation studies and real applications, we evaluate the merits of our models and demonstrate that they outperform state-of-the-art methods in diverse scenarios.

Key Words: rank aggregation; Mallows model; hierarchical ranking


Some challenges in analyzing big data in health and medical research

Bo Fu1

1 Fudan University, China

E-mail: [email protected]

Abstract: TBA


Estimating Truncated Functional Linear Models with a Nested Group Bridge Approach

Tianyu Guan1, Zhenhua Lin2, Jiguo Cao1,*

1Department of Statistics and Actuarial Science, Simon Fraser University

2 Department of Statistics, University of California, Davis

1,* Department of Statistics and Actuarial Science, Simon Fraser University

E-mail: [email protected]

Abstract: We study a scalar-on-function truncated linear regression model which assumes that the functional predictor does not influence the response when the time passes a certain cutoff point. We approach this problem from the perspective of locally sparse modeling, where a function is locally sparse if it is zero on a substantial portion of its defining domain. In the truncated linear model, the slope function is exactly a locally sparse function that is zero beyond the cutoff time. A locally sparse estimate then gives rise to an estimate of the cutoff time. We propose a nested group bridge penalty that is able to specifically shrink the tail of a function. Combined with the B-spline basis expansion and penalized least squares, the nested group bridge approach can identify the cutoff time and produce a smooth estimate of the slope function simultaneously. The proposed nested group bridge estimator is shown to be consistent, while its numerical performance is illustrated by simulation studies. The proposed nested group bridge method is demonstrated with an application of determining the effect of the past engine acceleration on the current particulate matter emission.

Key Words: B-spline basis functions; Functional data analysis; Functional linear regression; Group bridge approach; Locally sparse; Penalized B-splines


Modeling Traffic Crash Risk

Feng Guo1∗

1∗ Virginia Tech, Blacksburg, VA 24060, USA

[email protected]

Abstract: Accurate assessment of driver risk is challenging due to the rarity of crashes and high variability among individual drivers. This talk focuses on evaluating factors affecting driving risk as well as individual driving risk prediction. The emerging connected and automated vehicle technology provides rich in-situ telematics driving data that could provide personalized driving behavior information for risk assessment. This talk will introduce the latest developments in using naturalistic driving studies for driver risk prediction, specifically the Second Strategic Highway Research Program naturalistic driving study (SHRP2 NDS). A decision-adjusted framework is introduced to develop an optimal model for individual driver risk prediction.

Key Words: Traffic Safety; Crash Risk; Naturalistic Driving Study; Telematics; Driver Risk Evaluation


Moderate-Dimensional Inferences on Quadratic Functionals in Ordinary Least Squares

Xiao Guo1,*, Guang Cheng2

1International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China.

2Department of Statistics, Purdue University, West Lafayette, IN 47906.

E-mail: [email protected]

Abstract: Statistical inference on quadratic functionals of linear regression parameters has found wide application, including signal detection, one/two-sample global testing, inference on the fraction of variance explained and genetic co-heritability. Conventional theory based on the ordinary least squares estimator works perfectly in the fixed-dimensional regime, but fails when the parameter dimension p grows proportionally to the sample size n. In some cases, its performance is not satisfactory even when n > 5p. The main contribution of this paper is to illustrate that the signal-to-noise ratio (SNR) plays a crucial role in moderate-dimensional inference, where p/n -> c with 0 < c < 1. In the case of weak SNR, as often occurs in the moderate-dimensional regime, both bias and variance need to be corrected in the traditional inference procedures. The amount of correction mainly depends on the SNR and c, and could be fairly large as c -> 1. However, the classical fixed-dimensional results continue to hold if and only if the SNR is large enough, say when p diverges but more slowly than n. Our general theory holds, in particular, without Gaussian design/error or structural parameter assumptions, and applies to a broad class of quadratic functionals, covering all aforementioned applications. The mathematical arguments are based on random matrix theory and the leave-one-out method. Extensive numerical results demonstrate the satisfactory performance of the proposed methodology even when p > 0.9n in some extreme cases.

Key Words: Co-heritability; fraction of variance explained; linear regression model; moderate dimension; quadratic functional; signal-to-noise ratio
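
The bias mentioned in the abstract above can be seen in a small simulation. The sketch below is not the authors' method: it uses the textbook identity E[‖β̂‖²] = ‖β‖² + σ² tr((XᵀX)⁻¹) for ordinary least squares, and all names and simulation settings (Gaussian design, n = 400, p = 200, weak signal) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 400, 200, 1.0                    # p/n = 0.5: a moderate-dimensional regime
beta = np.full(p, 0.05)                        # weak signal: ||beta||^2 = 0.5
target = float(beta @ beta)

naive, corrected = [], []
for _ in range(200):
    X = rng.normal(size=(n, p))
    y = X @ beta + sigma * rng.normal(size=n)
    G_inv = np.linalg.inv(X.T @ X)
    beta_hat = G_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)       # unbiased estimate of the noise variance
    naive.append(beta_hat @ beta_hat)          # plug-in estimate of ||beta||^2
    corrected.append(beta_hat @ beta_hat - sigma2_hat * np.trace(G_inv))

naive_bias = float(np.mean(naive) - target)
corrected_bias = float(np.mean(corrected) - target)
```

With p/n = 0.5 the plug-in bias is roughly σ² p/(n − p − 1), about twice the signal itself in this setting, whereas the corrected version is centred near the target; the abstract's theory additionally addresses the variance correction, which this sketch does not.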


Local Inference in Additive Models with Decorrelated Local Linear Estimator

Zijian Guo1,∗, Cun-Hui Zhang1

1 Rutgers University, USA

E-mail: [email protected]

Abstract: Additive models, as a natural generalization of linear regression, have played an important role in studying nonlinear relationships. Despite much recent progress on additive models, the statistical inference problem in additive models is much less understood. Motivated by inference for the exposure effect, we tackle the statistical inference problem for $f_1'(x_0)$ in additive models, where $f_1$ denotes the univariate function of interest and $f_1'(x_0)$ denotes its first-order derivative evaluated at a specific point $x_0$. The main challenge for this local inference problem is the additional uncertainty from estimating the other nuisance functions. To address this, we propose a decorrelated local linear estimator, which is particularly useful in reducing the effect of the estimation error related to the nuisance functions on the estimation accuracy of $f_1'(x_0)$. We establish the asymptotic limiting distribution of the proposed estimator and then construct confidence intervals and conduct hypothesis testing for the estimand $f_1'(x_0)$. The variance level of the asymptotic limiting distribution is of the same order as that for nonparametric regression, while the bias of the proposed estimator is jointly determined by how well we can estimate the nuisance functions and by the relationship between the variable of interest and the nuisance variables. The method is developed for general additive models and is demonstrated in high-dimensional sparse additive models.

Key Words: Additive model; Inference; Local linear; Decorrelated; Function derivative.


Nonlocal online RPCA for video denoising

Zhi Han1

1 Shenyang Institute of Automation, Chinese Academy of Sciences, China

E-mail: [email protected]

Abstract: Online Robust Principal Component Analysis (RPCA) has been proposed to enhance the efficiency of (batch) RPCA for processing big or streaming data. However, for both online and batch RPCA, two drawbacks prevent their application to video denoising: 1) it is hard for them to extract accurate low-rank components when the scene or background of the video is non-static; 2) correlated information of regional patches or volumes, which may be used to further enhance the reconstruction performance, is not taken into consideration. To solve such problems, nonlocal techniques have been successfully applied to batch RPCA. But due to the characteristics of online RPCA, it is hard to apply nonlocal techniques directly. In this work, we propose a novel nonlocal online RPCA model for video denoising. Firstly, rather than grouping low-rank patches by applying methods like k-Nearest Neighbor (kNN) to every patch separately, we use a Gaussian Mixture Model (GMM) to cluster all the nonlocal patches, in order to build several updatable subspaces. Then we propose a weighted RPCA model and solve it via an Alternating Direction Method of Multipliers (ADMM) strategy. Finally, we propose a sequential multi-subspace update scheme, in which the subspaces and the corresponding Gaussian components are updated simultaneously by utilizing the subspace-projected newly arriving data.


Oracle P-value and Variable Screening

Ning Hao1, Hao Helen Zhang1,*

1University of Arizona, Mathematics Department, Tucson, AZ, USA

E-mail: [email protected]

Abstract: P-value, first proposed by Fisher to measure inconsistency of data with a specified null hypothesis, plays a central role in statistical inference. For classical linear regression analysis, it is a standard procedure to calculate P-values for regression coefficients based on least squares estimator (LSE) to determine their significance. However, for high dimensional data when the number of predictors exceeds the sample size, ordinary least squares are no longer proper and there is not a valid definition for P-values based on LSE. It is also challenging to define sensible P-values for other high dimensional regression methods such as penalization and resampling methods. In this paper, we introduce a new concept called oracle P-value to generalize traditional P-values based on LSE to high dimensional sparse regression models. Then we propose several estimation procedures to approximate oracle P-values for real data analysis. We show that the oracle P-value framework is useful for developing new tools in high dimensional data analysis, including variable ranking, variable selection, and screening procedures with false discovery rate (FDR) control. Numerical examples are presented to demonstrate performance of the proposed methods. Key Words: False discovery rate; Inference; Linear model


Inference in a mixture additive hazards cure model

Haijin He1

1 Shenzhen University, China

E-mail: [email protected]

Abstract: TBA


The Pearson Correlation Between Tree-Shaped Data Sets: Estimating, Graphical Representation and Hypothesis Testing

Shanjun Mao1, Xiaodan Fan1, Jie Hu2,∗

1 The Chinese University of Hong Kong, Hong Kong SAR, China

2,∗ Xiamen University, Fujian, China

E-mail: [email protected]

Abstract: Tree-shaped data sets are increasingly common in the real world, such as gene expression data measured on a cell lineage tree. Due to their complex structure, such as correlation that changes along the tree branches, the classical formula of the Pearson correlation coefficient cannot be applied directly to calculate the correlation between tree-shaped data sets, i.e. the tree correlation. In our study, first, a statistical model with a gradually weakening correlation mechanism for tree-shaped data is proposed, and a Bayesian approach is applied to estimate the tree correlation; secondly, a simple and intuitive graph representation is used to demonstrate the geometric significance of the tree correlation; finally, a $\chi^2$ test is constructed to test hypotheses on tree correlations. Extensive simulations are carried out to demonstrate the validity and compatibility of our model and algorithm. Furthermore, the application to a public dataset of gene expression measured on a cell lineage tree shows that our method can capture the correlation between tree-shaped data sets well.

Key Words: Pearson correlation; Tree-shaped data; Bayesian estimation; Graphical representation; Hypothesis testing


AI-Based Solution for Financial Risk Assessment and Fraud Detection

Ling Huang1

1AHI Fintech, Beijing, China

E-mail: [email protected]

Abstract: In this talk, I will present our work in applying a variety of Artificial Intelligence (AI) technologies to daily decision-making in business operations in the financial industry. Specifically, we propose to integrate the latest graph analytics with machine learning and develop a cutting-edge semi-supervised learning platform to provide risk assessment and fraud detection solutions for financial institutions. Our solutions seamlessly integrate expert knowledge with AI systems, automatically discover unknown risk patterns from massive data, build models with few labels to detect ever-changing fraudulent activities not seen before, and protect customers from the latest threats and attacks.

Key Words: Artificial Intelligence; Machine Learning; Risk Model; Fraud Detection


Causal mediation of semicompeting risks

Ying Chen1, Thorsten Koch2, Yen-Tsung Huang1

1 Academia Sinica

E-mail: [email protected]

Abstract: The semi-competing risks problem arises when one is interested in the effect of an exposure or treatment on both intermediate (e.g., having cancer) and primary events (e.g., death), where the intermediate event may be censored by the primary event, but not vice versa. Here we propose a nonparametric approach casting the semi-competing risks problem in the framework of causal mediation modeling. We set up a mediation model with the intermediate and primary events as the mediator and the outcome, respectively, and define the indirect effect (IE) as the effect of the exposure on the primary event mediated by the intermediate event and the direct effect (DE) as that not mediated by the intermediate event. A Nelson-Aalen type of estimator with time-varying weights is proposed for the direct and indirect effects, where the counting process at time $t$ of the primary event, $N_{2n_1}(t)$, and its compensator $A_{n_1}(t)$ are both defined conditional on the status of the intermediate event right before $t$, $N_1(t^-)=n_1$. We show that $N_{2n_1}(t)-A_{n_1}(t)$ is a zero-mean martingale. Based on this, we further establish the asymptotic unbiasedness, consistency and asymptotic normality of the proposed estimators. Numerical studies including simulation and a data application are presented to illustrate the finite-sample performance and utility of the proposed method.

Key words: causal inference; causal mediation; martingale; Nelson-Aalen estimator; semi-competing risks


Multiple Imputation on enhanced model identification for nonignorable nonresponse

Jongho Im1,∗, Kosuke Morikawa2, Tomoyuki Nakagawa3

1∗Presenter, Department of Applied Statistics, Yonsei University, Seoul, South Korea

2 Graduate School of Engineering Science, Osaka University, Osaka, Japan

3 Department of Information Sciences, Faculty of Science and Technology, Tokyo, Tokyo University of Science

E-mail: [email protected]

Abstract: Multiple imputation is a popular technique for analyzing incomplete data. Although multiple imputation usually assumed an ignorable missing mechanism in its early days, many new methods have been proposed for nonignorable missing mechanisms in recent years. However, those methods still have limitations in robustness to model specification and identification. In this study, we propose a new multiple imputation method with strengthened model identification. The outcome model is approximated by Gaussian mixtures and the response mechanism is assumed to be a logit model. As the response model is parametric but cannot be correctly specified in practice, model identification is important to obtain consistent estimates. We discuss when the response model can be safely identified in the presence of nonignorable nonresponse and then use the results to estimate our assumed model parameters. Results from a limited simulation study are presented to check the performance of the proposed multiple imputation method.

Key Words: data augmentation; Gaussian mixture; not missing at random; model identification


Generalized Four Moment Theorem and an Application to CLT for Spiked Eigenvalues of Large-dimensional Covariance Matrices

Dandan Jiang1

1 School of Mathematics & Statistics, Xi'an Jiaotong University, No.28, Xianning West Road, Xi'an, Shaanxi, 710049, China

E-mail: [email protected]

Abstract: We consider a more generalized spiked covariance matrix $\Sigma$, which is a general non-definite matrix with the spiked eigenvalues scattered into a few bulks and the largest ones allowed to tend to infinity. By relaxing the matching of the 4th moment to a tail probability decay, a Generalized Four Moment Theorem (G4MT) is proposed to show the universality of the asymptotic law for the local spectral statistics of generalized spiked covariance matrices, which implies that the limiting distribution of the spiked eigenvalues of the generalized spiked covariance matrix is independent of the actual distributions of the samples satisfying our relaxed assumptions. Moreover, by applying it to the Central Limit Theorem (CLT) for the spiked eigenvalues of the generalized spiked covariance matrix, we also extend the result of Bai and Yao (2012) to a general case in which the 4th moment is not necessarily required to exist and the population covariance matrix is in a general form without the diagonal block independence assumption, thus better matching actual cases.


Functional-coefficient regression models with GARCH errors

Jiancheng Jiang1,2*

1 School of Mathematical Sciences, Nankai University, Tianjin 300071, China

2 Department of Mathematics and Statistics, University of North Carolina at Charlotte, NC 28277, USA

E-mail: [email protected]

Abstract: GARCH models are widely used to model various heavy-tailed financial data with nonlinear and heteroscedastic structures. In this talk, we propose a functional-coefficient regression model with GARCH errors to model these kinds of data. To deal with the influence of heteroscedasticity, we introduce a two-step approach to estimating the unknown coefficient functions and the volatility. With the estimates of the unknown parameters and functions, one may consider making simultaneous inference about the parameters and making predictions for the conditional mean and volatility. Asymptotic properties of the proposed estimators are established. Our results demonstrate that the functional coefficients can be estimated as if the volatility were known. Simulations and real data examples show that the proposed estimators significantly improve on the estimation efficiency of the unweighted estimators when there are GARCH effects.

Key Words: functional coefficients; GARCH errors; local linear smoothing; QMLE


Prediction of hospital readmission frailties with misspecified shared frailty models

Xuejun Jiang1

1 Department of Mathematics, Southern University of Science and Technology, Shenzhen 518055, China

E-mail: [email protected]

Abstract: In this paper we study prediction of hospital readmission frailties using various misspecified shared frailty models. We review commonly used frailty distributions including the gamma, positive stable, power variance function, lognormal, Weibull, and compound Poisson distributions. We employ the EM algorithm and the penalized likelihood for estimating the parameters. Based on these estimates, we construct the best prediction of the risk. We conduct simulations to evaluate the performance of different misspecified shared frailty models and to examine if the best prediction is sensitive to the specification of frailty. Finally we predict the risk of hospital readmission using misspecified frailty models.

Key Words: EM algorithm; Frailty; Misspecification; Prediction; Penalized likelihood


The Operating Principle of Regularized Spectral Clustering

Juhee Cho1, Donggyu Kim2, Karl Rohe3, Song Wang4,*

1Microsoft, USA

2KAIST, South Korea

3University of Wisconsin-Madison, Wisconsin Madison, USA

4Amazon, USA

E-mail: [email protected]

Abstract: Spectral Clustering is one of the most popular modern clustering algorithms. Its performance can be improved using regularization, and we call the resulting method Regularized Spectral Clustering. Although Regularized Spectral Clustering is widely used, its statistical operating principle is not well studied. In this paper, we investigate why Regularized Spectral Clustering works well under the degree-corrected stochastic block model consisting of dense cohesive cores and sparse loosely-connected peripheries. We show that the eigenvalues of the regularized row-and-column-normalized adjacency matrix are ordered according to increasing expected degree. Thus, the regularization makes the eigenvectors corresponding to dense cohesive cores come first, which helps Regularized Spectral Clustering provide a more balanced cut than Spectral Clustering. A simulation study is conducted to check the properties that we find, and we also apply Regularized Spectral Clustering to several real network data sets.

Key Words: network; community detection; degree-corrected stochastic block model; core/periphery structure
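
For readers unfamiliar with the regularization step, here is a minimal sketch. It assumes the common form in which a constant tau/n is added to every entry of the adjacency matrix before row-and-column normalization; the default tau (average degree), the two-block toy network, and the sign-based split used in place of k-means are all illustrative assumptions rather than the authors' setup.

```python
import numpy as np

def regularized_spectral_embedding(A, k, tau=None):
    """Eigenvectors for the k largest eigenvalues of D_tau^{-1/2} (A + tau/n) D_tau^{-1/2}."""
    n = A.shape[0]
    if tau is None:
        tau = A.sum() / n                      # average degree, a commonly used default
    A_tau = A + tau / n                        # add a small constant to every entry
    d = A_tau.sum(axis=1)
    L = A_tau / np.sqrt(np.outer(d, d))        # row-and-column normalization
    vals, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    return vecs[:, -k:]                        # last column belongs to the largest eigenvalue

rng = np.random.default_rng(2)
n = 100
labels = np.repeat([0, 1], n // 2)
P = np.where(labels[:, None] == labels[None, :], 0.5, 0.05)   # two-block edge probabilities
upper = np.triu(rng.random((n, n)) < P, k=1)
A = (upper | upper.T).astype(float)            # symmetric adjacency matrix, no self-loops

emb = regularized_spectral_embedding(A, k=2)
pred = (emb[:, 0] > 0).astype(int)             # sign of the second eigenvector splits two blocks
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
```

In sparse networks with low-degree peripheries the unregularized normalization can be dominated by a few weakly connected nodes; adding tau/n damps their influence, which is the effect the abstract analyses.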


Discrepancy between global and local principal component analysis on large-panel high-frequency data

Xin-Bing Kong1, Jin-Guan Lin1, Cheng Liu2, Guang-Ying Liu1

1Nanjing Audit University, Nanjing, China

2Wuhan University, Wuhan, China

E-mail: [email protected]

Abstract: The global principal component analysis (GPCA), i.e. PCA applied directly to the whole sample, cannot reliably reconstruct the common components of a large panel of high-frequency data when the factor loadings are time-varying, but it works when the factor loadings are constant. However, the local principal component analysis (LPCA) presented in Kong (2017, 2018) results in consistent estimates of the common components even if the factor loading processes follow Ito semimartingales. The LPCA is also suited for online computation in a "big data" framework with restricted storage and memory. This motivates us to study the discrepancy between the GPCA and LPCA in recovering the common components of large-panel high-frequency data. In this paper, we measure the discrepancy by the total sum of squared differences between the common components reconstructed from GPCA and LPCA. We provide the asymptotic distribution of the discrepancy measure when the factor loadings are constant. Alternatively, when some factor loadings are time-varying, the discrepancy measure explodes at a rate higher than root pk_n under some mild conditions on the time-variation magnitude of the factor loadings, where k_n is the size of each subsample. We then apply the theory to testing the hypothesis that the factor loading matrix is a constant matrix. We show that the test performs well in controlling the type I error and detecting time-varying loadings. Our real data analysis provides evidence that the factor loading matrices are always time-varying.


Optimal Estimation of Wasserstein Distance on Trees with An Application to Microbiome Studies

Hongzhe Li1

1University of Pennsylvania, Philadelphia, PA 19104, USA

E-mail: [email protected]

Abstract: The weighted UniFrac distance, a plug-in estimator of the Wasserstein distance of read counts on a tree, has been widely used to investigate microbial community differences in microbiome studies. Our investigation, however, shows that such a plug-in estimator, although intuitive and commonly used in practice, suffers from potential bias. Motivated by this, we study the problem of estimating the Wasserstein distance between two distributions on a tree from sampled data in a high-dimensional but non-asymptotic setting. To overcome the bias problem, we introduce a new estimator, referred to as the moment-screening estimator on a tree (MET), by conducting implicit polynomial approximation that incorporates the tree structure. The new estimator is computationally efficient and is shown to be minimax rate-optimal. Applications to both simulated and real biological datasets demonstrate the practical merits of MET, including reduced bias and statistically more significant differences in the microbiome between inactive Crohn's disease patients and normal controls.

Key Words: Phylogenetic tree; UniFrac distance; 16S rRNA sequencing; polynomial approximation
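
The plug-in quantity that the abstract starts from has a simple closed form on a tree: the sum over edges of the edge length times the absolute difference in probability mass below that edge. The sketch below computes it for a toy tree; it illustrates the plug-in estimator only (which the abstract argues can be biased), not the MET estimator, and the tree encoding via `parent` and `edge_length` arrays is an assumption of this example.

```python
import numpy as np

def tree_wasserstein(parent, edge_length, p, q):
    """Sum over edges of length * |P(subtree) - Q(subtree)|; node 0 is the root.

    Assumes nodes are numbered so that parent[v] < v for every non-root node v;
    edge_length[v] is the length of the edge joining v to parent[v].
    """
    n = len(parent)
    sub = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    for node in range(n - 1, 0, -1):           # accumulate mass differences from leaves upward
        sub[parent[node]] += sub[node]
    return float(sum(edge_length[v] * abs(sub[v]) for v in range(1, n)))

# Toy tree: root 0 has children 1 and 2; node 1 has children 3 and 4 (leaves: 2, 3, 4).
parent = [-1, 0, 0, 1, 1]
edge_length = [0.0, 1.0, 2.0, 0.5, 0.5]
counts_a = np.array([0, 0, 10, 30, 60])        # read counts per node in sample A
counts_b = np.array([0, 0, 50, 25, 25])        # read counts per node in sample B
p = counts_a / counts_a.sum()                  # plug-in (empirical) distributions
q = counts_b / counts_b.sum()
dist = tree_wasserstein(parent, edge_length, p, q)
```

Here a mass of 0.4 has to travel from leaves 3 and 4 to leaf 2 along a path of length 3.5, giving a distance of 1.4. Because empirical proportions replace the true ones inside an absolute value, sampling noise inflates each term, which is the source of the bias discussed in the abstract.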


A supplement to Jiang's asymptotic distribution of the largest entry of a sample correlation matrix

Deli LI1

1Lakehead University, Canada

E-mail: [email protected]

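Abstract: Let $\{X, X_{k,i};\ i \ge 1, k \ge 1\}$ be an array of i.i.d. random variables and let $\{p_n;\ n \ge 1\}$ be a sequence of positive integers such that $n/p_n$ is bounded away from $0$ and $\infty$. The sample correlation matrix $\Gamma_n = \big(\hat\rho^{(n)}_{i,j}\big)_{p_n \times p_n}$ is generated from $X_{1,1}, \ldots, X_{n,1}, \ldots, X_{1,p_n}, \ldots, X_{n,p_n}$ such that $\hat\rho^{(n)}_{i,j}$ is the Pearson correlation coefficient between $(X_{1,i}, \ldots, X_{n,i})$ and $(X_{1,j}, \ldots, X_{n,j})$. In this talk, we provide a supplement to Jiang's asymptotic distribution of the largest entry $L_n = \max_{1 \le i < j \le p_n} \big|\hat\rho^{(n)}_{i,j}\big|$. We show that, for a given nondecreasing function $h: [0, \infty) \to (0, \infty)$ with $\lim_{x \to \infty} h(x) = \infty$, there exists an array $\{X, X_{k,i};\ i \ge 1, k \ge 1\}$ of symmetric i.i.d. random variables such that $\mathbb{E}\,h(|X|) = \infty$ and, for some subsequence $\{n_m;\ m \ge 1\}$ of $\{1, 2, 3, \ldots\}$,

$$\lim_{m \to \infty} \left(\frac{n_m}{\log n_m}\right)^{1/2} L_{n_m} = 2 \quad \text{almost surely},$$

$$\lim_{n \to \infty} \left(\frac{n}{\log n}\right)^{1/2} L_n \quad \text{does not exist almost surely},$$

$$\lim_{m \to \infty} \mathbb{P}\left(n_m L_{n_m}^2 - a_{n_m} \le t\right) = \exp\left(-\frac{1}{\sqrt{8\pi}}\, e^{-t/2}\right), \quad -\infty < t < \infty,$$

and $n L_n^2 - a_n$ does not converge in distribution, where $a_n = 4 \log p_n - \log\log p_n$, $n \ge 2$.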
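
To make the quantities in this abstract concrete, the following sketch computes the largest off-diagonal entry $L_n$ of a sample correlation matrix and the centered statistic $n L_n^2 - a_n$ for one simulated data set. Everything here is an assumption for illustration: the data are standard Gaussian (not the heavy-tailed construction of the talk), the sizes n = 50 and p = 20 are arbitrary, and the helper names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def largest_entry(columns):
    """L_n: largest absolute off-diagonal sample correlation."""
    p = len(columns)
    return max(abs(pearson(columns[i], columns[j]))
               for i in range(p) for j in range(i + 1, p))


def centered_statistic(columns):
    """n * L_n^2 - a_n with a_n = 4 log p - log log p."""
    n, p = len(columns[0]), len(columns)
    a_n = 4 * math.log(p) - math.log(math.log(p))
    return n * largest_entry(columns) ** 2 - a_n


rng = random.Random(0)
n, p = 50, 20
columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
L_n = largest_entry(columns)
stat = centered_statistic(columns)
print(L_n, stat)
```

Repeating the simulation many times and comparing the empirical distribution of `stat` with $\exp(-e^{-t/2}/\sqrt{8\pi})$ would give an informal check of the limit law in the Gaussian case.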


High-Dimensional Vector Autoregressive Time Series Modeling via Tensor Decomposition

Di Wang1, Heng Lian2, Wai Keung Li1, Guodong Li1∗

1* Department of Statistics and Actuarial Science, University of Hong Kong, HKSAR

2 Department of Mathematics, City University of Hong Kong, HKSAR

E-mail: [email protected]

Abstract: The classical vector autoregressive model is a fundamental tool for multivariate time series analysis. However, it involves too many parameters for high-dimensional time series, and hence suffers from the curse of dimensionality. In this paper, we rearrange the parameter matrices of a vector autoregressive model into a tensor form, and use the tensor decomposition to restrict the parameter space in three directions. Compared with the reduced-rank regression method, which can limit the parameter space in one direction only, the proposed method dramatically improves the capability of vector autoregressive models in handling high-dimensional time series. For this method, its asymptotic properties are studied and an alternating least squares algorithm is suggested. Moreover, for the case with much higher dimension, we further assume the sparsity of three loading matrices, and the regularization method is thus considered for estimation and variable selection. An ADMM-based algorithm is proposed for the regularized method and oracle inequalities for the global minimizer are established.

Key Words: High dimensional time series; Reduced-rank regression; Regularization; Tucker decomposition; Variable selection
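
The rearrangement described in this abstract can be sketched as follows, assuming a VAR(P) model for an N-dimensional series with lag matrices A_1, ..., A_P stacked into an N x N x P array. The three matricizations below correspond to the three directions in which a Tucker decomposition can restrict the parameter space, and the parameter count assumes Tucker ranks (r1, r2, r3) without any identifiability adjustment. All function names are hypothetical, the column ordering of the unfoldings is one of several conventions, and no estimation is performed.

```python
def stack_var_coefficients(mats):
    """Stack P lag matrices (each N x N) so that tensor[i][j][k] = A_{k+1}[i][j]."""
    n = len(mats[0])
    return [[[a[i][j] for a in mats] for j in range(n)] for i in range(n)]


def unfold(tensor, mode):
    """Matricize a 3-way array along `mode` (0, 1 or 2); rows follow that mode."""
    d0, d1, d2 = len(tensor), len(tensor[0]), len(tensor[0][0])
    if mode == 0:
        return [[tensor[i][j][k] for k in range(d2) for j in range(d1)]
                for i in range(d0)]
    if mode == 1:
        return [[tensor[i][j][k] for k in range(d2) for i in range(d0)]
                for j in range(d1)]
    if mode == 2:
        return [[tensor[i][j][k] for j in range(d1) for i in range(d0)]
                for k in range(d2)]
    raise ValueError("mode must be 0, 1 or 2")


def tucker_parameter_count(n, p, ranks):
    """Core tensor plus three factor matrices, ignoring identifiability constraints."""
    r1, r2, r3 = ranks
    return r1 * r2 * r3 + n * r1 + n * r2 + p * r3


a1 = [[1, 2], [3, 4]]
a2 = [[5, 6], [7, 8]]
tensor = stack_var_coefficients([a1, a2])
# With this ordering the first unfolding is [A_1, A_2] side by side, the
# coefficient matrix whose rank reduced-rank regression restricts.
print(unfold(tensor, 0))
print(tucker_parameter_count(40, 4, (3, 3, 2)), "vs", 40 * 40 * 4)
```

For N = 40 and P = 4, the unrestricted model has 6400 coefficients, while the assumed ranks (3, 3, 2) leave 266 free numbers, which gives a sense of the dimension reduction the abstract refers to.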


Tensor Analysis and Neuroimaging Applications

Lexin Li1

1 Department of Biostatistics and Epidemiology, University of California, Berkeley

E-mail: [email protected]

Abstract: Data in the form of multidimensional arrays, or tensors, are fast emerging in a wide variety of scientific and business applications. Simply turning an array into a vector would both induce extremely high dimensionality and destroy the inherent structure of the array. In this talk, we discuss two tensor analysis problems, one about regression with a tensor-valued response, and the other about dynamic tensor clustering. We exploit the special structure of tensors and introduce two low-dimensional structures, sparsity and low-rankness, which help bring the ultrahigh dimensionality down to a manageable level. We develop fast estimation algorithms and derive the associated asymptotic properties. We illustrate with two applications in brain imaging analysis.


Statistical Learning for Personalized Wealth Management

Yi Ding1, Yingying Li2∗, Rui Song3

1 Hong Kong University of Science and Technology, HKSAR

2∗ Hong Kong University of Science and Technology, HKSAR

3 North Carolina State University, USA

E-mail: [email protected]

Abstract: We establish a statistical learning framework for personalized wealth management. A high-dimensional Q-learning methodology is proposed for continuous decision making. The proposed method is shown to enjoy desirable oracle properties and to facilitate valid statistical inference for optimal values. Empirically, the proposed statistical learning methodology is exercised with Health and Retirement Study data. The results show that the proposed personalized optimal strategy can improve an individual's financial well-being and surpasses benchmark strategies under a consumption-based utility framework. Key Words: High-dimensionality; Statistical Learning; Wealth Management


Mediation analysis for zero-inflated mediators

Zhigang Li1

1 University of Florida, USA

E-mail: [email protected]

Abstract: TBA


Identifiability and Non-Convex Algorithm for Multi-Channel Blind Deconvolution

Song Li1

1 School of Mathematical Sciences, Zhejiang University, China

E-mail: [email protected]

Abstract: In this talk, we consider the multichannel blind deconvolution problem. We propose a deterministic subspace assumption, which is widely used in practice, and give some theoretical results. First of all, we derive a tight sufficient condition for identifiability of the signal and convolution kernels, which is only violated on a set of Lebesgue measure zero. Then, we present a non-convex regularization algorithm based on a lifting method that approximates the rank-one constraint, and show that the global minimizer of the proposed non-convex algorithm is a rank-one matrix under mild conditions on the parameters and noise level. A stability result is also shown under the assumption that the inputs lie in a compact set. Finally, we provide numerical experiments to show that our non-convex model outperforms convex relaxation models, such as nuclear norm minimization, and some non-convex methods (the alternating minimization method and the spectral method).


A non-randomized multiple testing procedure for large-scale heterogeneous discrete hypotheses based on randomized tests

Xiaoyu Dai1, Nan Lin1,*, Daofeng Li2, Ting Wang2

1Department of Mathematics and Statistics, Washington University in St. Louis, USA

2Department of Genetics, Washington University in St. Louis, USA

E-mail: [email protected]

Abstract: High-throughput genomic studies often require performing genome-wide multiple discrete testing. However, most existing multiple testing procedures for controlling the false discovery rate (FDR) assume that test statistics are continuous and become conservative for discrete tests. To overcome the conservativeness, we propose a novel multiple testing procedure for better FDR control on heterogeneous discrete tests. Our procedure makes decisions based on the marginal critical function (MCF) of randomized tests, which enables achieving a powerful and non‐randomized multiple testing procedure. We provide upper bounds of the positive FDR (pFDR) for our procedure and show that the set of detections made by our method contains every detection made by a naive application of the widely used q‐value method. We further demonstrate our method by simulations and a real example of differentially methylated region (DMR) detection using whole‐genome bisulfite sequencing (WGBS) data. Key Words: multiple testing; next-generation sequencing; randomized test
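
One way to picture a marginal critical function is through the randomized p-value of a single discrete test. The sketch below assumes a one-sided binomial test and the standard randomized-test construction, in which the randomized p-value is uniform between P(X > x) and P(X >= x); the MCF at level alpha is then the probability that this randomized p-value falls below alpha. The function names are hypothetical and the talk's procedure may define or use the MCF differently.

```python
from math import comb


def binom_upper_tail(n, theta, x):
    """P(X >= x) for X ~ Binomial(n, theta)."""
    return sum(comb(n, k) * theta ** k * (1 - theta) ** (n - k)
               for k in range(x, n + 1))


def marginal_critical_function(n, theta0, x, alpha):
    """P(randomized p-value <= alpha | X = x) for H0: theta = theta0 vs theta > theta0."""
    p_value = binom_upper_tail(n, theta0, x)      # conventional p-value, P(X >= x)
    p_lower = binom_upper_tail(n, theta0, x + 1)  # next attainable value, P(X > x)
    if alpha >= p_value:
        return 1.0  # rejected by the ordinary non-randomized test
    if alpha <= p_lower:
        return 0.0  # never rejected, even with randomization
    return (alpha - p_lower) / (p_value - p_lower)


# Observing 4 successes in 5 trials under theta0 = 0.5: the conventional
# p-value is 6/32, so the ordinary test at level 0.05 does not reject.
mcf = marginal_critical_function(5, 0.5, 4, 0.05)
print(mcf)
```

Here the MCF is 0.12 rather than 0, so the test contributes partial evidence at level 0.05. Ranking or thresholding tests by such values is a deterministic rule, even though the underlying randomized test is not.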


Deep Neural Networks for Rotation-Invariance Approximation and Learning

Shao-Bo Lin1

1Wenzhou University, 325035, China

E-mail: [email protected]

Abstract: The objective of this talk is to design deep neural networks with two or more hidden layers (called deep nets) for realization of radial functions so as to enable rotational invariance for near-optimal function approximation in an arbitrarily high-dimensional Euclidean space. It is shown that deep nets have much better performance than shallow nets (with only one hidden layer) in terms of approximation accuracy and learning capabilities. In particular, for learning radial functions, it is shown that a near-optimal rate can be achieved by deep nets but not by shallow nets. Our results illustrate the necessity of depth in neural network design for realization of rotation-invariant target functions. Key Words: Deep nets; rotation-invariance; learning theory; radial-basis functions


A Quantile Association-based Variable Selection

Yuanyuan Lin1, Niansheng Tang2, Jinhan Xie2, Wenlu Tang1

1The Chinese University of Hong Kong, Hong Kong, China

2Key Lab of Statistical Modeling and Data Analysis of Yunnan Province, Yunnan University, Kunming 650091, China.

E-mail: [email protected]

Abstract: In modern scientific discovery, identifying important variables in high-dimensional data is intrinsically challenging, especially when there are complex relationships among predictors. In this paper, without any specification of a regression model, we introduce a quantile association-based statistic to identify influential predictors, which is flexible enough to capture a wide range of dependence. The asymptotic null distribution of the proposed statistic is established under mild conditions. Based on the proposed statistic, a multiple testing procedure is advocated to simultaneously test the independence between each predictor and the response variable in high dimensionality. The proposed procedure is able to detect relevant variables with pairwise or higher-order interactions. It is computationally efficient as no optimization or resampling is involved. We prove its theoretical properties and show that the proposal asymptotically controls the false discovery rate at a given significance level. Numerical studies, including simulation studies and real data analysis, provide supporting evidence that the proposal performs reasonably well in practical settings. Key Words: Quantile association; high dimensionality


Some Statistical Methods for Single-cell Genomics

Zhixiang Lin1,*

1Department of Statistics, the Chinese University of Hong Kong, HKSAR

E-mail: [email protected]

Abstract: Some recent work regarding the analysis of single-cell chromatin accessibility (scATAC-Seq) and single-cell gene expression (scRNA-Seq) data will be discussed. First, we will present scABC, a toolkit for analyzing scATAC-Seq data, and extensions that improve rare cell population detection. Second, a model-based framework for the joint analysis of scATAC-Seq and scRNA-Seq data will be presented. We show that the underlying cell types can be better characterized through the joint analysis. Key Words: Single-cell genomics; Bayesian modeling; Clustering


Weighted multiple-quantile classifiers for functional data with application in multiple sclerosis screening

Haiqiang Ma2, Sheng Xu3, Catherine Liu1,*, Kam Chuen Yuen4

1,3The Hong Kong Polytechnic University, Hung Hom, Kowloon, HKSAR

2Jiangxi University of Finance and Economics, Nanchang, China

4 University of Hong Kong, Po Fu Lam Road, HKSAR

E-mail: [email protected]

Abstract: Multiple sclerosis (MS) is the most prevalent chronic neurological disease. It can be diagnosed by functional data generated from diffusion tensor imaging (DTI). Early recognition and treatment are crucial in the management of MS patients. Existing functional classifiers seem to suffer from high false negative rates or high false positive rates or both. To develop a classifier with low false negative and false positive rates, we define a generalized distance measure for the functional data. Using this generalized distance, we show that the existing classifiers can be derived by choosing appropriate loss functions. Furthermore, when we consider the quantile loss function, we are able to develop a weighted multiple-quantile (weMulQ) classifier that is robust, accurate, and computationally fast. We show that it is asymptotically consistent and enjoys the near perfection optimality. Numerically, we demonstrate that it outperforms the other methods when the data are from a generalized Gaussian noise process with mixed populations. Finally, we apply weMulQ to classify MS patients using a DTI data set collected from the Johns Hopkins University and the Kennedy-Krieger Institute. Our classifier indeed has much lower false negative and false positive rates than the existing methods. Key Words: almost perfection; classification; quantile
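
The quantile loss mentioned in this abstract is commonly taken to be the check loss rho_tau(u) = u * (tau - 1{u < 0}). The short sketch below shows, on made-up scalar data, that minimizing the total check loss recovers a sample quantile and is insensitive to an outlier. This illustrates only the loss function under that assumption; it is not the weMulQ classifier, and the names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile (check) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_quantile_fit(data, tau):
    """Data point minimizing the total check loss, i.e. a tau-th sample quantile."""
    return min(data, key=lambda q: sum(check_loss(x - q, tau) for x in data))


data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_quantile_fit(data, 0.5)
lower_fit = best_quantile_fit(data, 0.25)
print(median_fit, lower_fit)
```

The outlier 100.0 moves the mean to 22.0 but leaves the check-loss minimizer at tau = 0.5 equal to the median, 3.0, which is the sense in which quantile-based distances are robust.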


Optimal Covariance Matrix Estimation for High-dimensional Noise in High-frequency Data

Cheng Yong Tang1, Cheng Liu2, Jinyuan Chang3

1Temple University,

2Wuhan University,

3Southwestern University of Finance and Economics

E-mail: [email protected]

Abstract: In this paper, we consider efficiently learning the structural information from the high-dimensional noise in high-frequency data via estimating its covariance matrix with optimality. The problem is uniquely challenging due to the latency of the targeted high-dimensional vector containing the noises, and the practical reality that the observed data can be highly asynchronous: not all components of the high-dimensional vector are observed at the same time points. To meet the challenges, we propose a new covariance matrix estimator with appropriate localization and thresholding. In the setting with latency and asynchronous observations, our theoretical analysis establishes the minimax optimal convergence rates associated with two common loss functions for covariance matrix estimation. As a major theoretical development, we show that despite the latency of the signal in the high-frequency data, the optimal rates remain the same as if the targeted high-dimensional noises were directly observable. Our results indicate that the optimal rates reflect the impact of the asynchronous observations and are slower than those with synchronous observations. Furthermore, we demonstrate that the proposed localized estimator with thresholding achieves the minimax optimal convergence rates. We also illustrate the empirical performance of the proposed estimator with extensive simulation studies and a real data analysis. Key Words: High-dimensional covariance matrix; high-frequency data analysis; measurement error; minimax optimality; thresholding


Data-adaptive Kernel Support Vector Machine

Liu, X.1∗, He, W.2

1 Shanghai University of Finance and Economics, 200433, China

2 The University of Western Ontario, N6A 3K7, Canada

E-mail: [email protected]

Abstract: The support vector machine (SVM) is popularly used as a classifier in applications such as pattern recognition, texture mining and image retrieval owing to its flexibility and interpretability. However, its performance deteriorates when the response classes are imbalanced. To enhance the performance of the support vector machine classifier in the imbalanced cases, we investigate a new two-stage method that adaptively scales the kernel function. Based on the information obtained from the standard SVM in the first stage, we conformally rescale the kernel function in a data-adaptive fashion in the second stage so that the separation between the two classes can be effectively enlarged, especially when observations are imbalanced. The proposed method takes into account the location of the support vectors in the feature space, and is therefore especially appealing when the response classes are imbalanced. Simultaneously, we consider how to select the important features with data-adaptive kernels in SVMs, and spatial association that may exist. The approach is further applied in multi-category classification problems. Key Words: Classification; imbalanced image data; spatial association; support vector machine; feature selection.
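
A conformal rescaling of a kernel has the general form K~(x, z) = c(x) c(z) K(x, z) for a positive factor c. The sketch below uses one common choice from the conformal-transformation literature, in which c(x) is a sum of Gaussian bumps centered at the first-stage support vectors, so that the induced metric is magnified near the decision boundary. The RBF base kernel, the bump width `tau`, the support vectors, and all function names are assumptions for illustration; the data-adaptive scaling of the talk may take a different form.

```python
import math


def sq_dist(x, z):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, z))


def rbf_kernel(x, z, gamma):
    """Base kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sq_dist(x, z))


def conformal_factor(x, support_vectors, tau):
    """c(x): large near the first-stage support vectors, small far from them."""
    return sum(math.exp(-sq_dist(x, s) / tau ** 2) for s in support_vectors)


def rescaled_kernel(x, z, support_vectors, gamma, tau):
    """Second-stage kernel c(x) * c(z) * K(x, z)."""
    return (conformal_factor(x, support_vectors, tau)
            * conformal_factor(z, support_vectors, tau)
            * rbf_kernel(x, z, gamma))


# Hypothetical support vector from a first-stage SVM fit.
support_vectors = [(0.0, 0.0)]
x, z = (0.0, 0.0), (1.0, 0.0)
k_tilde = rescaled_kernel(x, z, support_vectors, gamma=1.0, tau=1.0)
print(k_tilde)
```

Because c(x) c(z) K(x, z) is a product of a positive semidefinite kernel with a rank-one positive semidefinite kernel, the rescaled function remains a valid kernel and can be passed to a standard SVM solver in the second stage.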


Testing of covariate effects under ridge regression for high-dimensional data

Xu Liu1

1Shanghai University of Finance and Economics, No. 777, Guoding Lu, China

E-mail: [email protected]

Abstract: In this paper, we revisit ridge regression under high-dimensional model settings. We propose a novel estimator of the error variance and establish its asymptotic normality, based on random matrix theory, as the dimension of the covariates diverges with the sample size; the estimator is promising compared with its competitors, including the refitted cross-validation method. An upper bound on the mean squared error (MSE) is given for the ridge regression estimator of the coefficients, and two new test statistics are provided for inference on the covariate effects. One is based on the distance between the estimated and true coefficients, motivated by the MSE. The other is motivated by the well-known Wald-type test in the low-dimensional setting. Asymptotic properties are obtained under some regularity conditions. Numerical examples are used to assess the finite sample performance of the proposed methods. Key Words: High-dimensional test; Random matrix; Ridge regression


Towards Software-Defined Infrastructure for Decentralized Data Governance

Xuanzhe Liu1

1Peking University, No.5, Yiheyuan Road, Haidian District, Beijing, China

E-mail: [email protected]

Abstract: Exploring and mining the explosive burst of “big data” has already generated many innovative applications, especially the recent advances in AI applications, and has thus produced great value for human society and civilization. However, due to the centralized patterns of data governance activities, including creation, sharing, exchange, management, analytics, tracing, and accounting, the potential value of big data distributed on the Internet is far from being adequately explored. The recent announcement of data protection policies and laws such as GDPR makes the problem even more challenging. We are now at a moment of truth where the data governance infrastructure should be reconsidered and redesigned. In this paper, we propose a software-defined infrastructure design in a decentralized fashion: data owners are able to implement and deploy their own rules to the application systems where the data are produced for further governance activities. This approach is quite similar to the popular software-defined networking, where users are allowed to deploy rules on switches and customize their use. Our principled infrastructure design can radically reform the current data governance activities into a decentralized topology. On the one hand, data can be separated from the application that generates the data, and data owners can have full rights to decide where their data should be stored and how the data can be shared. On the other hand, data users can search, discover, integrate, and analyze the data from various data sources according to their application requirements and scenarios. As a result, we argue that our infrastructure can establish a new generation of responsive decentralized data governance that can promote the innovation of linking data to the better adaptation of the open environment and the diverse user requirements. With this perspective, we briefly discuss some key insights and enumerate several related new technologies and open challenges. Key Words: data governance; software-defined


Distributed learning from multiple EHR databases: Contextual embedding models for predicting medical events

Qi Long1

1University of Pennsylvania, USA

E-mail: [email protected]

Abstract: P-value, first proposed by Fisher to measure inconsistency of data with a specified null hypothesis, plays a central role in statistical inference. For classical linear regression analysis, it is a standard procedure to calculate P-values for regression coefficients based on least squares estimator (LSE) to determine their significance. However, for high dimensional data when the number of predictors exceeds the sample size, ordinary least squares are no longer proper and there is not a valid definition for P-values based on LSE. It is also challenging to define sensible P-values for other high dimensional regression methods such as penalization and resampling methods. In this paper, we introduce a new concept called oracle P-value to generalize traditional P-values based on LSE to high dimensional sparse regression models. Then we propose several estimation procedures to approximate oracle P-values for real data analysis. We show that the oracle P-value framework is useful for developing new tools in high dimensional data analysis, including variable ranking, variable selection, and screening procedures with false discovery rate (FDR) control. Numerical examples are presented to demonstrate performance of the proposed methods. Key Words: False discovery rate; Inference; Linear model


Wavelet Empirical Likelihood Estimator for Stationary and Locally Stationary Long Memory Processes

Zhiping Lu1, Yulin Zhu

1East China Normal University, China

E-mail: [email protected]

Abstract: This paper introduces a version of wavelet empirical likelihood based on the periodogram and spectral estimating equations. This formulation handles dependent data through a data transformation. The method results in likelihood ratios which can be used to build nonparametric, asymptotically correct confidence regions for stationary long memory processes and locally stationary long memory processes. A Monte Carlo simulation is carried out to demonstrate the effectiveness of the method. Key Words: Whittle estimator; Empirical likelihood; Wavelet transform; Locally stationary; Long memory processes
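
Since the method is built on the periodogram, a minimal sketch of that ingredient may help. It assumes the usual definition I(lambda_j) = |sum_t x_t exp(-i lambda_j t)|^2 / (2 pi n) at the Fourier frequencies lambda_j = 2 pi j / n, evaluated by a direct sum on a toy cosine series. It does not implement the wavelet transform or the empirical likelihood step, and the function name is hypothetical.

```python
import cmath
import math


def periodogram(x):
    """Return (lambda_j, I(lambda_j)) at the Fourier frequencies j = 1, ..., n // 2."""
    n = len(x)
    out = []
    for j in range(1, n // 2 + 1):
        lam = 2 * math.pi * j / n
        dft = sum(x_t * cmath.exp(-1j * lam * t)
                  for t, x_t in enumerate(x, start=1))
        out.append((lam, abs(dft) ** 2 / (2 * math.pi * n)))
    return out


# Toy series: a pure cosine completing 3 cycles over n = 16 observations.
n = 16
series = [math.cos(2 * math.pi * 3 * t / n) for t in range(1, n + 1)]
pgram = periodogram(series)
peak_index = max(range(len(pgram)), key=lambda k: pgram[k][1]) + 1
print(peak_index, pgram[peak_index - 1][1])
```

The periodogram concentrates at j = 3, the frequency of the cosine. For a long memory process the mass would instead pile up near frequency zero, which is the feature spectral estimating equations exploit.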


GMV Prediction Using Driver Preference

Guojun Wu1, Yanhua Li2, Shikai Luo1,*

1Didi Chuxing, Beijing, China

2Worcester Polytechnic Institute, MA, US

E-mail: [email protected]

Abstract: Drivers make a sequence of decisions while working. One of the most important is whether to stop working (log off) after finishing an order or after being idle for a while. We model this sequential decision-making process as a Markov Decision Process (MDP). We extract two types of features, income-related and user-experience-related, to model the drivers' decision space. The reward function represents the preference each driver has over different decision-making features. We utilize inverse reinforcement learning to extract drivers' individual preferences from intra-day working cycles. We are interested in predicting each driver's future income. We show that models with individual preferences substantially improve prediction accuracy over models with only inter-day characteristics. Key Words: data mining; inverse reinforcement learning


A Nonparametric Bayesian Approach to Simultaneous Subject and Cell Heterogeneity Discovery for Single Cell RNA-Seq Data

Qiuyu Wu1, Xiangyu Luo1,*

1Institute of Statistics and Big Data, Renmin University of China, Beijing, China

E-mail: [email protected]

Abstract: The advent of the single cell sequencing era opens new avenues for the personalized treatment. The first but important step is discovering the subject heterogeneity at the single cell resolution. In this talk, we address the two-level-clustering problem of simultaneous subject subgroup discovery (subject level) and cell type detection (cell level) based on the single cell RNA sequencing (scRNA-seq) data from multiple subjects. However, current approaches either cluster cells without considering the subject heterogeneity or group subjects not using the single cell information. We develop a solid nonparametric Bayesian model SCSC (Subject and Cell clustering for Single-Cell data) to achieve subject and cell grouping at the same time without pre-specifying the subject subgroup number or the cell type number. An efficient blocked Gibbs sampler is then proposed for the posterior inference. The simulation study and the real application demonstrate the good performance of our model. Key Words: nonparametric Bayes; mixture model; nonignorable missing; MCMC; single cell genomics


A Versatile Estimation Procedure without Estimating the Nonignorable Missingness Mechanism

Yanyuan Ma1

1Penn State University, USA

E-mail: [email protected]

Abstract: We consider the estimation problem in a regression setting where the outcome variable is subject to nonignorable missingness and identifiability is ensured by the shadow variable approach. We propose a versatile estimation procedure where modeling of the missingness mechanism is completely bypassed. We show that our estimator is easy to implement and we derive the asymptotic theory of the proposed estimator. We also investigate some alternative estimators under different scenarios. Comprehensive simulation studies are conducted to demonstrate the finite sample performance of the method. We apply the estimator to a children's mental health study to illustrate its usefulness.


Matrix Completion under Low-Rank Missing Mechanism

Xiaojun Mao1, Raymond Wong2, Song Xi Chen3,*

1School of Data Science, Fudan University, Shanghai 200433, China

2Department of Statistics, Texas A&M University, College Station, Texas 77843, U.S.A.

3Department of Business Statistics and Econometrics, Guanghua School of Management, and Center for Statistical Science, Peking University, Beijing 100651, China.

E-mail: [email protected]

Abstract: Matrix completion is a modern missing data problem where both the missing structure and the underlying parameter are high-dimensional. Although the missing structure is a key component of any missing data problem, existing matrix completion methods often assume a simple uniform missing mechanism. In this work, we study matrix completion from corrupted data under a novel low-rank missing mechanism. The probability matrix of observation is estimated via a high-dimensional low-rank matrix estimation procedure and further used to complete the target matrix via inverse probability weighting. Due to both the high-dimensional and the extreme (i.e., very small) nature of the true probability matrix, the effect of inverse probability weighting requires careful study. We derive optimal asymptotic convergence rates of the proposed estimators for both the observation probabilities and the target matrix. Key Words: Low-rank; Missing; Nuclear-norm; Regularization.
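
The inverse probability weighting step can be illustrated on its own. In the sketch below each observed entry is divided by its observation probability, so that the weighted matrix is unbiased for the target entry by entry. A second helper gives the variance of one weighted entry, a^2 (1 - theta) / theta, which shows why very small probabilities need care. The matrices, probabilities, and function names are made up for illustration, and the low-rank estimation of the probabilities and the nuclear-norm step are not shown.

```python
def ipw_adjust(observed, mask, prob):
    """Entrywise inverse probability weighting: observed * mask / prob."""
    rows, cols = len(observed), len(observed[0])
    return [[observed[i][j] * mask[i][j] / prob[i][j] for j in range(cols)]
            for i in range(rows)]


def ipw_entry_variance(value, theta):
    """Variance of value * M / theta when M ~ Bernoulli(theta)."""
    return value ** 2 * (1.0 - theta) / theta


# Toy 2 x 2 example: mask[i][j] = 1 where the entry was observed.
observed = [[4.0, 0.0], [0.0, 2.0]]
mask = [[1, 0], [0, 1]]
prob = [[0.5, 0.5], [0.25, 0.125]]
weighted = ipw_adjust(observed, mask, prob)
print(weighted)
print(ipw_entry_variance(1.0, 0.5), ipw_entry_variance(1.0, 0.01))
```

An entry observed with probability 0.01 receives weight 100 and has roughly 99 times the variance of an entry observed with probability 0.5, so errors in estimating small probabilities are strongly amplified.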


A mean field theory of two-layers neural networks

Song Mei1

1Stanford University, USA

E-mail: [email protected].

Abstract: Multi-layer neural networks are among the most powerful models in machine learning, yet the fundamental reasons for this success defy mathematical understanding. Learning a neural network requires optimizing a non-convex high-dimensional objective (risk function), a problem which is usually attacked using stochastic gradient descent (SGD). Does SGD converge to a global optimum of the risk or only to a local optimum? In the first case, does this happen because local minima are absent, or because SGD somehow avoids them? In the second, why do local minima reached by SGD have good generalization properties? In this talk we consider a simple case, namely two-layers neural networks, and prove that, in a suitable scaling limit, SGD dynamics is captured by a certain non-linear partial differential equation (PDE) that we call distributional dynamics (DD). We then consider several specific examples, and show how DD can be used to prove convergence of SGD to networks with nearly ideal generalization error. This description allows one to 'average out' some of the complexities of the landscape of neural networks, and can be used to prove a general convergence result for noisy SGD. Key Words: mean field theory; stochastic gradient descent; distributional dynamics
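
A minimal sketch of the setting, under stated assumptions: the network output is an average of N neurons, here a_i * tanh(w_i * x) with scalar input, and one SGD step on the squared loss updates every neuron with the same residual. The activation, the scalar input, the step size, the data point, and the function names are all illustrative choices; the 1/N factor from the average is taken to be absorbed into the step size, as is usual in this scaling. The distributional dynamics PDE itself is not simulated.

```python
import math
import random


def predict(params, x):
    """Mean-field two-layer network: average of N neurons a_i * tanh(w_i * x)."""
    return sum(a * math.tanh(w * x) for a, w in params) / len(params)


def sgd_step(params, x, y, step):
    """One SGD step on the squared loss (y - predict)^2 for a single sample."""
    residual = y - predict(params, x)
    updated = []
    for a, w in params:
        t = math.tanh(w * x)
        grad_a = t                      # derivative of a * tanh(w x) in a
        grad_w = a * x * (1.0 - t * t)  # derivative of a * tanh(w x) in w
        updated.append((a + 2.0 * step * residual * grad_a,
                        w + 2.0 * step * residual * grad_w))
    return updated


rng = random.Random(0)
params = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(50)]
x, y = 1.0, 0.5
loss_before = (y - predict(params, x)) ** 2
params_after = sgd_step(params, x, y, step=0.05)
loss_after = (y - predict(params_after, x)) ** 2
print(loss_before, loss_after)
```

Each neuron moves only slightly, but all neurons move in response to the same residual. In the limit of many neurons it is the evolution of the empirical distribution of the pairs (a_i, w_i) that a PDE description tracks.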


A Dynamic Additive and Multiplicative Effects Model with Application to the United Nations Voting Behaviours

Bomin Kim1, Xiaoyue Niu2,*, David Hunter2, Xun Cao2

1Freddie Mac, United States of America

2The Pennsylvania State University, United States of America

E-mail: [email protected]

Abstract: In this paper, we introduce a statistical regression model for discrete-time networks that are correlated over time. Our model is a dynamic version of a Gaussian additive and multiplicative effects (DAME) model, which extends the latent factor network model of Hoff (2009) and the additive and multiplicative effects model of Hoff et al. (2014) by incorporating the temporal correlation structure into the prior specifications of the parameters. The temporal evolution of the network is modeled through a Gaussian process (GP) as in Durante and Dunson (2014), where we estimate the unknown covariance structure from the dataset. We analyze the United Nations General Assembly voting data from 1983 to 2014 (Voeten, 2013) and show the effectiveness of our model at inferring the dyadic dependence structure among the international voting behaviors as well as allowing for a varying number of nodes over time. Overall, the DAME model shows a significantly better fit to the dataset compared to alternative approaches. Moreover, after controlling for other dyadic covariates such as geographic distances and bilateral trade between countries, the model-estimated additive effects, multiplicative effects, and their movements reveal interesting and meaningful foreign policy positions and alliances of various countries.

Key Words: dynamic network; latent variable; additive and multiplicative effects

A Super Scalable Algorithm for Short Segment Detection

Feifei Xiao, Heping Zhang, Ning Hao, Yue Niu1

1University of Arizona, 617 N Santa Rita Ave, USA

E-mail: [email protected]

Abstract: In many applications such as copy number variant (CNV) detection, the goal is to identify short segments on which the observations have different means or medians from the background. Those segments are usually short and hidden in a long sequence, and hence are very challenging to find. We study a super scalable short segment (4S) detection algorithm in this paper. This nonparametric method clusters the locations where the observations exceed a threshold for segment detection. It is computationally efficient and does not rely on a Gaussian noise assumption. Moreover, we develop a framework to assign significance levels for detected segments. We demonstrate the advantages of our proposed method by theoretical, simulation, and real data studies.

Key Words: copy number variant; scalable; segmentation
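
The threshold-and-cluster idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' 4S implementation: the function name `detect_short_segments` and the parameters `threshold`, `max_gap`, and `min_length` are hypothetical choices made here for illustration, and no significance levels are computed.

```python
def detect_short_segments(values, threshold, max_gap=1, min_length=2):
    """Return (start, end) index pairs of candidate segments (end inclusive).

    Hypothetical illustration: positions whose value exceeds `threshold` are
    collected, positions at most `max_gap` apart are merged into one cluster,
    and clusters with fewer than `min_length` exceedances are dropped.
    """
    hits = [i for i, v in enumerate(values) if v > threshold]
    segments = []
    if not hits:
        return segments
    start = prev = hits[0]
    count = 1
    for i in hits[1:]:
        if i - prev <= max_gap:
            # Close enough to the previous exceedance: extend the cluster.
            prev = i
            count += 1
        else:
            # Gap too large: close the current cluster and start a new one.
            if count >= min_length:
                segments.append((start, prev))
            start = prev = i
            count = 1
    if count >= min_length:
        segments.append((start, prev))
    return segments


signal = [0.1, -0.3, 0.2, 2.5, 2.8, 0.4, 2.6, 0.0, -0.2, 3.1, 0.1, 0.3]
print(detect_short_segments(signal, threshold=2.0, max_gap=2, min_length=2))
# prints [(3, 6)]
```

The isolated exceedance at index 9 is discarded because it does not form a cluster of at least `min_length` points. A single pass over the data keeps the cost linear in the sequence length, which is in the spirit of the scalability the abstract emphasizes.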

Improved doubly robust estimation in learning individualized treatment rules

Yinghao Pan1, Yingqi Zhao2

1University of North Carolina at Charlotte, U.S.A

2Fred Hutchinson Cancer Research Center, U.S.A

E-mail: [email protected]

Abstract: Because patients respond heterogeneously to treatment, there is a growing interest in developing novel and efficient statistical methods for estimating individualized treatment rules (ITRs). The central idea is to recommend treatment according to patient characteristics, and the optimal ITR is the one that maximizes the expected clinical outcome if followed by the patient population. We propose an improved estimator of the optimal ITR that enjoys two key properties. First, it is doubly robust, meaning that the proposed estimator is consistent if either the propensity score or the outcome model is correct. Second, it achieves the smallest variance among its class of doubly robust estimators when the propensity score model is correctly specified, regardless of the specification of the outcome model. Simulation studies show that the estimated optimal ITR obtained from our method yields a better clinical outcome than its main competitors. Data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study are analyzed as an illustrative example.

Key Words: Double robustness; Individualized treatment rule; Personalized medicine
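
To make the double robustness property concrete, the sketch below uses the standard augmented inverse probability weighted (AIPW) estimator of the mean outcome under a treatment rule. This is a textbook construction swapped in for illustration, not the improved estimator proposed in the talk; the function `aipw_value`, its arguments, and the toy data are all hypothetical.

```python
def aipw_value(records, rule, propensity, outcome_model):
    """AIPW estimate of the mean outcome if everyone followed `rule`.

    records:       iterable of (x, a, y) with binary treatment a in {0, 1}
    rule:          function x -> recommended treatment in {0, 1}
    propensity:    function x -> estimated P(A = 1 | X = x)
    outcome_model: function (x, a) -> estimated E[Y | X = x, A = a]
    """
    total = 0.0
    n = 0
    for x, a, y in records:
        d = rule(x)
        p1 = propensity(x)
        p_d = p1 if d == 1 else 1.0 - p1      # probability of receiving d
        mu_d = outcome_model(x, d)            # model prediction under d
        follows_rule = 1.0 if a == d else 0.0
        # Weighted residual corrects the model prediction when a == d.
        total += follows_rule / p_d * (y - mu_d) + mu_d
        n += 1
    return total / n


# Toy data (assumed): outcome is y = a * x, treatment assigned with probability 0.5.
records = [(1, 1, 1.0), (1, 0, 0.0), (-1, 1, -1.0), (-1, 0, 0.0)]
treat_if_positive = lambda x: 1 if x > 0 else 0

# Correct propensity, deliberately wrong outcome model.
v1 = aipw_value(records, treat_if_positive, lambda x: 0.5, lambda x, a: 0.0)
# Deliberately wrong propensity, correct outcome model.
v2 = aipw_value(records, treat_if_positive, lambda x: 0.8, lambda x, a: float(a * x))
print(v1, v2)
```

In this toy example both calls recover the true value of the rule, 0.5, even though one of the two nuisance models is misspecified in each call.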

Predicting terrorist events: opportunities and challenges

Andre Python1,*, Tim Lucas1, Penelope Hancock1, Andreas Bender1, Rohan Arambepola1, and Anita Nandi1

1Li Ka Shing Centre for Health Information and Discovery, Big Data Institute, Nuffield Department of Medicine, University of Oxford, UK

E-mail: [email protected]

Abstract: In 2017, 18,700 people lost their lives due to terrorist attacks. Iraq, Syria, Pakistan, and Afghanistan accounted for more than half of the attacks and of the total number of deaths attributed to terrorism worldwide. Determining the location and time of terrorist events is key to preventing terrorism. So far, predictive approaches have mainly focused on armed conflict and insurgency without considering terrorism. In this talk, I will first briefly review recent approaches that have been used to predict terrorism. Second, I will introduce eXtreme Gradient Boosting (XGBoost), a machine-learning algorithm used to predict, one week ahead, the locations of terrorist events at fine spatial scale in Iraq, Iran, Afghanistan, and Pakistan. I will conclude the talk by comparing the predictive performance of the XGBoost approach with alternative machine learning approaches and spatio-temporal models.

Key Words: machine learning; terrorism; prediction; space-time

On the ‘Off-Label’ Use of Data Normalization for Sample Classification and Prognostication

Li-Xuan Qin1,*

1Memorial Sloan Kettering Cancer Center, New York, New York, USA

E-mail: [email protected]

Abstract: Data normalization is an important preprocessing step for molecular data containing unwanted data variation due to experimental handling. There has been a critical yet overlooked disconnect between the use of data normalization and the goals of subsequent analysis: on one hand, methods for data normalization have been mostly developed for the analysis goal of group comparison; on the other hand, these methods have encountered frequent ‘off-label’ use for other goals such as sample classification, neglecting the impact of potential ‘side-effects’ of normalization such as over-compressed data variability. A bridge between these two is made possible by a unique pair of microRNA array datasets on the same set of tumor tissue samples that were collected at Memorial Sloan Kettering Cancer Center. In this talk, I will share our findings, through empirical analysis and re-sampling-based simulations using this dataset pair, on how data normalization impacts the development of tumor sample classifiers and survival outcome predictors.

Key Words: Genomics; Microarray; Normalization; Classification; Prediction; Personalized medicine

Adaptive Minimax Density Estimation for Huber’s Contamination Model under $L_p$ Losses

Zhao Ren1

1University of Pittsburgh, Pittsburgh, PA, 15260, USA

E-mail: [email protected]

Abstract: Today's data pose unprecedented challenges as they may be incomplete, corrupted or exposed to some unknown source of contamination. In this talk, we address the problem of estimating a density function $f$ under $L_p$ losses ($1\leq p <\infty$) for Huber's contamination model, in which one observes i.i.d. observations from $(1-\epsilon)f+\epsilon g$ and $g$ represents the unknown contamination distribution. We investigate the effects of the contamination proportion $\epsilon$, among other key quantities, on the corresponding minimax rates of convergence for both structured and unstructured contamination classes: for structured contamination, $\epsilon$ always appears linearly in the optimal rates, while for unstructured contamination, the leading term of the optimal rate involving $\epsilon$ also relies on the smoothness of the target density class and the specific loss function. We further carefully study the corresponding adaptation theory in contamination models. Two different Goldenshluger-Lepski-type methods are proposed to select the bandwidth and achieve $L_p$ risk oracle inequalities for structured and unstructured contaminations respectively. It is shown that the proposed procedures lead to minimax rate-adaptivity over a scale of the anisotropic Nikol'skii classes for most scenarios, except that adaptation to both the contamination proportion $\epsilon$ and the smoothness of the density class for unstructured contamination is shown to be impossible. Our technical analysis of the adaptive procedures relies on some uniform bounds under the $L_p$ norm of empirical processes developed by Goldenshluger and Lepski.

Key Words: adaptivity; minimax rate; contamination; robust statistics; nonparametric density estimation
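
As a small illustration of the data-generating mechanism in Huber's contamination model, the sketch below draws observations from $(1-\epsilon)f+\epsilon g$. The choices of $f$ (standard normal) and $g$ (uniform on $[5, 6]$), the function name `sample_huber_contamination`, and the seed are assumptions made here for illustration only; the abstract leaves $g$ unknown.

```python
import random


def sample_huber_contamination(n, epsilon, draw_target, draw_contamination, seed=0):
    """Draw n i.i.d. observations from (1 - epsilon) * f + epsilon * g.

    Each observation comes from g with probability epsilon and from f
    otherwise. The returned flags record which draws were contaminated;
    in practice these labels are not observed.
    """
    rng = random.Random(seed)
    sample = []
    flags = []
    for _ in range(n):
        contaminated = rng.random() < epsilon
        draw = draw_contamination if contaminated else draw_target
        sample.append(draw(rng))
        flags.append(contaminated)
    return sample, flags


data, flags = sample_huber_contamination(
    n=10_000,
    epsilon=0.1,
    draw_target=lambda rng: rng.gauss(0.0, 1.0),           # f: standard normal (assumed)
    draw_contamination=lambda rng: rng.uniform(5.0, 6.0),  # g: arbitrary choice
)
print(sum(flags) / len(flags))  # empirical contamination proportion, close to epsilon
```

A density estimator applied to `data` sees only the mixture, which is why the contamination proportion enters the minimax rates discussed in the abstract.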

Dynamic Spatial Panel Data Models with Endogeneity and Common Factors

Wei Shi1

1 Institute for Economic and Social Research, Jinan University∗

E-mail: [email protected]

Abstract: Spatial interactions and common factors are two popular approaches to modeling cross sectional dependencies, which reflect local and global dependence respectively. The recent literature has investigated models with both spatial interactions and common factors in a dynamic panel data setup with both large n and T. In many applications of these models, some explanatory variables may be endogenous and the spatial weight matrices may be stochastic and based on variables that may also be endogenous. Using a control function approach to address the endogeneity, this paper proposes a QML estimator and provides conditions for its consistency and asymptotic normality. Due to the presence of predetermined terms and common factors, the estimator may have an asymptotic bias, and an analytical bias correction procedure is provided. We examine the finite sample behavior of the estimator in a set of Monte Carlo simulations and apply the model to study the effect of cigarette taxation on demand when cross state shopping is present.

Key Words: Spatial panel data; endogenous spatial weighting matrix; multiplicative individual and time effects; QMLE

Bridging the gap between noisy healthcare data and knowledge: automated translation of medical terminology

Xu Shi1,*, Xiaoou Li2, Tianxi Cai1

1Department of Biostatistics, Harvard University, USA

2Department of Statistics, University of Minnesota, USA

E-mail: [email protected]

Abstract: Routinely collected healthcare data present numerous opportunities for biomedical research but also come with unique challenges. In this talk, we detail the challenge of inconsistent “languages” used by different healthcare systems and coding systems. In particular, different healthcare providers may use alternative medical codes to record the same diagnosis or procedure, limiting the transportability of phenotyping algorithms and statistical models across healthcare systems. We present an automated data quality control pipeline that aims to address this challenge and make the transition from data to knowledge. We formulate the idea of medical code translation into a statistical problem of inferring a mapping between two sets of multivariate, unit-length vectors learned from two healthcare systems, respectively. The statistical problem is particularly interesting because the training data is corrupted by a fraction of mismatch in the response-predictor pairs, whereas classical regression analysis tacitly assumes that the response and predictor are correctly linked. We propose a novel method for mapping recovery and establish theoretical guarantees for estimation and model selection consistency.

Key Words: electronic health records; mismatched data; ontology translation; spherical regression

Estimating the sample mean and standard deviation from the five-number summary and their applications in evidence-based medicine

Tiejun Tong1

1 Department of Mathematics, Hong Kong Baptist University, HKSAR

E-mail: [email protected]

Abstract: Evidence-based medicine is attracting increasing attention as a way to improve decision making in medical practice by integrating evidence from well designed and conducted clinical research. Meta-analysis is a statistical technique widely used in evidence-based medicine for analytically combining the findings from independent clinical trials to provide an overall estimate of a treatment's effectiveness. The sample mean and standard deviation are two commonly used statistics in meta-analysis, but some trials use the median, the minimum and maximum values, or sometimes the first and third quartiles to report the results. Thus, to pool results in a consistent format, researchers need to transform that information back to the sample mean and standard deviation. In this talk, I will introduce our recent advances in the optimal estimation of the sample mean and standard deviation for meta-analysis from both theoretical and empirical perspectives. Specifically, we solve the problems by incorporating the sample size in a smoothly changing weight in the estimators to reach the optimal estimation. Our proposed estimators not only improve on the existing ones significantly but also share the same virtue of simplicity. The real data application indicates that our proposed estimators are capable of serving as “rules of thumb” and will be widely applied in evidence-based medicine.

Key Words: Median; meta-analysis; mid-range; mid-quartile range; optimal weight; sample mean; sample size
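
The general shape of a sample-size-weighted estimator can be sketched as follows, for the case where a trial reports only the minimum, median, and maximum. The blend of mid-range and median follows the key words above, but the weight function `placeholder_weight` is a hypothetical stand-in: it is not the optimal weight derived in the talk, and the function names are chosen here for illustration.

```python
def placeholder_weight(n):
    """Hypothetical weight in (0, 1) that shrinks smoothly as n grows."""
    return 1.0 / (1.0 + n ** 0.5)


def estimate_mean(minimum, median, maximum, n, weight=placeholder_weight):
    """Blend the mid-range and the median using a sample-size-dependent weight."""
    w = weight(n)
    mid_range = (minimum + maximum) / 2.0
    return w * mid_range + (1.0 - w) * median


print(estimate_mean(minimum=2.0, median=10.0, maximum=30.0, n=9))
# prints 11.5
```

With `n = 9` the placeholder weight is 0.25, so the estimate is 0.25 × 16 + 0.75 × 10 = 11.5. As the sample size grows, the weight on the mid-range shrinks and the estimate moves toward the median; any other weight function can be passed through the `weight` argument.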

An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss

Cheng Wang1, Binyan Jiang2,*

1 School of Mathematical Sciences, MOE-LSC, Shanghai Jiao Tong University, Shanghai 200240, China.

2 Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

E-mail: [email protected]

Abstract: The estimation of high dimensional precision matrices has been a central topic in statistical learning. However, as the number of parameters scales quadratically with the dimension $p$, many state-of-the-art methods do not scale well to problems with a very large $p$. In this paper, we propose a very efficient algorithm for precision matrix estimation via penalized quadratic loss functions. Under the high dimension low sample size setting, the computational complexity of our algorithm is linear in both the sample size and the number of parameters. Such a computational complexity is in some sense optimal, as it is the same as the complexity needed for computing the sample covariance matrix. Numerical studies show that our algorithm is much more efficient than other state-of-the-art methods when the dimension $p$ is very large.

Key Words: High dimension; Penalized quadratic loss; Precision matrix

MOSUM-based test and estimation method for multiple changes in panel data

Man Wang1, Rongmao Zhang2, Ngaihang Chan3,*

1Department of Finance, Donghua University, Shanghai, China

2School of Mathematics, Zhejiang University, Hangzhou, China

3Department of Statistics, The Chinese University of Hong Kong, Shatin, HKSAR

E-mail: [email protected]

Abstract: Detection of common changes in panel data is a vibrant topic in both econometrics and statistics. However, most existing testing methods suffer from power loss under certain configurations of multiple change points. To solve this problem, in this paper, a moving sum (MOSUM) based test is proposed for detecting multiple changes in panel data. Under mild conditions, it is shown that the proposed test statistic converges to an extreme distribution of a Gaussian process under the null hypothesis of no change, and diverges to infinity under the alternative hypothesis of multiple changes. Additionally, based on the MOSUM test, an estimation method which estimates the locations of all the change points simultaneously is given and its consistency is established. To illustrate the performance of the proposed testing and estimation methods, a number of simulation studies have been conducted. The simulation results show that the proposed MOSUM based test outperforms most existing cumulative sum (CUSUM) based procedures in the multiple-changes setting, and that the estimation method performs satisfactorily. The test and estimation method are applied to US state-level personal income data. The testing result shows that there exist common structural breaks in the growth rate of personal income of the 50 states, and the estimated five change points accord with the business cycle behavior of the US.

Key Words: MOSUM; change point; panel data
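
The moving-sum idea can be illustrated on a single series. The sketch below computes a basic univariate MOSUM statistic for mean changes; it is a simplified stand-in, not the panel test statistic proposed in the paper, and the function name `mosum_statistics`, the `bandwidth` parameter, and the known-`sigma` scaling are assumptions made here.

```python
import math


def mosum_statistics(x, bandwidth, sigma=1.0):
    """Return {k: T_k} for k = bandwidth, ..., len(x) - bandwidth.

    T_k compares the sum of the `bandwidth` observations after position k
    with the sum of the `bandwidth` observations before it, scaled by
    sigma * sqrt(2 * bandwidth).
    """
    n = len(x)
    g = bandwidth
    stats = {}
    for k in range(g, n - g + 1):
        left = sum(x[k - g:k])     # window just before position k
        right = sum(x[k:k + g])    # window starting at position k
        stats[k] = abs(right - left) / (sigma * math.sqrt(2 * g))
    return stats


# Noise-free toy series with two mean changes, at positions 10 and 20.
series = [0.0] * 10 + [3.0] * 10 + [0.0] * 10
stats = mosum_statistics(series, bandwidth=5)
peaks = sorted(stats, key=stats.get, reverse=True)[:2]
print(sorted(peaks))
# prints [10, 20]
```

Because each statistic uses only a local window, the two changes, which offset each other over the full sample, both produce clear peaks. This locality is one intuition for why moving sums can retain power under multiple changes where full-sample cumulative sums may lose it.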

Structured tensor decomposition and its application

Miaoyan Wang1

1University of Wisconsin – Madison, USA

E-mail: [email protected]

Abstract: Tensors of order 3 or greater, known as higher-order tensors, have recently attracted increased attention in many fields. Methods built on tensors provide powerful tools to capture complex structures in data that lower-order methods may fail to exploit. However, extending familiar matrix concepts to higher-order tensors is not straightforward, and indeed it has been shown that most computational problems for tensors are NP-hard. In this talk, I will present some statistical results on tensor decomposition. We focus on high-dimensional tensors with special structures, such as low-rankness, sparsity, multi-way blocks, and binary-valued tensors. Such problems arise in several applications such as collaborative filtering, compressed sensing, sensor network localization, and topic modeling. We propose proper loss functions and give the performance bound under generalized multilinear models. We demonstrate the power of our approach on the tasks of tensor completion and clustering, with improved performance over previous methods.

Key Words: High dimensionality; higher-order tensors; CP tensor decomposition

Identification of the number of factors for factor modeling in high dimensional time series

Qinwen Wang1

1Fudan University, Shanghai, China

E-mail: [email protected]

Abstract: Identifying the number of factors in a high-dimensional factor model has attracted much attention in recent years and a general solution to the problem is still lacking. A promising ratio estimator based on the singular values of the lagged autocovariance matrix has been recently proposed in the literature and is shown to perform well under specific assumptions on the strength of the factors. Inspired by this ratio estimator and as a first main contribution, we propose a complete theory of such sample singular values for both the factor part and the noise part under the large-dimensional scheme where the dimension and the sample size proportionally grow to infinity. In particular, we provide the exact description of the phase transition phenomenon that determines whether a factor is strong enough to be detected with the observed sample singular values. Based on these findings, we propose a new estimator of the number of factors which is strongly consistent for the detection of all significant factors (which are the only theoretically detectable ones). In particular, factors are assumed to have a minimum strength above the phase transition boundary, which is of the order of a constant; they are thus not required to grow to infinity together with the dimension (as assumed in most of the existing papers on high-dimensional factor models).

Key Words: high-dimensional factor model; autocovariance matrix; singular values
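
The ratio estimator that motivates this work can be sketched in a few lines, assuming the singular values of the lagged autocovariance matrix have already been computed. This shows the earlier ratio idea only, not the new estimator proposed in the talk; the function name `ratio_estimate_num_factors`, the `max_factors` default, and the example values are hypothetical.

```python
def ratio_estimate_num_factors(singular_values, max_factors=None):
    """Pick the k in 1..max_factors minimizing s_{k+1} / s_k.

    `singular_values` are assumed positive; they are sorted in decreasing
    order before the ratios of consecutive values are compared.
    """
    s = sorted(singular_values, reverse=True)
    if max_factors is None:
        max_factors = len(s) // 2   # arbitrary cap, an assumption of this sketch
    best_k, best_ratio = None, float("inf")
    for k in range(1, max_factors + 1):
        ratio = s[k] / s[k - 1]     # s_{k+1} / s_k in 1-based notation
        if ratio < best_ratio:
            best_k, best_ratio = k, ratio
    return best_k


# Three large values followed by a bulk of small ones (made-up numbers).
print(ratio_estimate_num_factors([9.0, 7.5, 6.0, 0.5, 0.45, 0.4, 0.38, 0.35]))
# prints 3
```

The sharpest relative drop occurs between the third and fourth values, so three factors are reported. A weak factor whose singular value sits close to the noise bulk would not produce such a drop, which is the detectability issue the phase transition analysis in the abstract addresses.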

Large Multiple Graphical Model Inference via Bootstrap

Shaoli Wang1, Yongli Zhang, Xiaotong Shen

1 Shanghai University of Finance and Economics, China

E-mail: [email protected]

Abstract: Large economic and financial networks may experience stage-wise change as a result of external shocks. To detect and infer a structural change, we consider an inference problem in the framework of multiple Gaussian graphical models when the number of graphs and the dimension of graphs increase with the sample size. In this setting, two major challenges emerge as a result of the bias and uncertainty inherent in regularization required to treat such overparameterized models. To deal with these challenges, bootstrap is utilized to approximate the sampling distribution of a likelihood ratio test statistic. We show theoretically that the proposed method leads to a correct asymptotic inference in a high-dimensional setting regardless of the distribution of the test statistic. Simulations show that the proposed method compares favorably to its competitors such as the Likelihood Ratio Test. An application is given to analyze a network of 200 stocks.

An adaptive independence test for microbiome community data

Tao Wang1, Yaru Song1, Hongyu Zhao2,*

1Shanghai Jiao Tong University, Shanghai, China

2Yale University, New Haven, USA

E-mail: [email protected]

Abstract: Advances in sequencing technologies and bioinformatics tools have vastly improved our ability to collect and analyze data from complex microbial communities. A major goal of microbiome studies is to correlate the overall microbiome composition with clinical or environmental variables. La Rosa et al. (2012) recently proposed a parametric test for comparing microbiome populations between two or more groups of subjects. However, this method is not applicable for testing the association between the community composition and a continuous outcome. Although multivariate non-parametric methods based on permutations are widely used in ecology studies, they lack interpretability and can be inefficient for analyzing microbiome data. We consider the problem of testing for independence between the microbial community composition and a continuous or many-valued variable. By partitioning the range of the variable into a few slices, we formulate the problem as a problem of comparing multiple groups of microbiome samples, with each group indexed by a slice. To model multivariate and over-dispersed count data, we use the Dirichlet-multinomial distribution. We propose an adaptive likelihood-ratio test by learning a good partition or slicing scheme from the data. A dynamic programming algorithm is developed for numerical optimization. We demonstrate the superiority of the proposed test by comparing it to that of La Rosa et al. (2012) and popular approaches on the same topic including PERMANOVA, the distance covariance test, and the microbiome regression-based kernel association test. We further apply it to test the association of gut microbiome with age in three geographically distinct populations, and show how the learned partition facilitates differential abundance analysis.

Key Words: Adaptive slicing; Community-level analysis; Differential abundance testing; Distance-based methods; Penalization

Integrated Quantile Rank Test (iQRAT) for heterogeneous joint effect of rare and common variants in sequencing studies

Tianying Wang, Iuliana Ionita-Laza, Ying Wei

Department of Biostatistics, Columbia University, New York, NY

E-mail: [email protected]

Abstract: Genetic association studies often evaluate the combined group-wise effects of rare and common genetic variants on phenotype at the gene level. Many approaches have been proposed for group-wise association tests, such as the widely used burden tests and sequence kernel association tests. Most of these approaches focus on identifying mean effects. As genetic associations are complex, we propose an efficient integrated rank test to investigate the genetic effect across the entire distribution/quantile function of a phenotype. The resulting test complements the mean-based analysis and improves efficiency and robustness. The proposed test integrates the rank score test statistics over quantile levels while incorporating the Cauchy combination test scheme and Fisher's method to maximize the power. It generalizes the classical quantile-specific rank-score test. Using simulation studies and real Metabochip data on lipid traits, we investigated the performance of the new test in comparison with the burden tests and sequence kernel association tests in multiple scenarios.

Key Words: Quantile process; Association test; Sequencing analysis; Joint effects; Rare variants
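
The two p-value combination schemes named in the abstract can be sketched in their standard textbook forms. How iQRAT applies them to rank score statistics across quantile levels is not specified here, so the functions below (with the hypothetical names `cauchy_combination` and `fisher_combination`) only show the generic combination step, assuming every p-value lies strictly between 0 and 1.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values via the Cauchy combination statistic.

    The statistic is T = sum_i w_i * tan((0.5 - p_i) * pi) with nonnegative
    weights summing to one; the combined p-value is 0.5 - atan(T) / pi.
    """
    m = len(p_values)
    if weights is None:
        weights = [1.0 / m] * m
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    return 0.5 - math.atan(t) / math.pi


def fisher_combination(p_values):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(log p_i) is chi-square with 2m degrees of freedom under the
    null; for even degrees of freedom the tail probability has the closed
    form exp(-X/2) * sum_{k<m} (X/2)^k / k!.
    """
    m = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= half_stat / k
        total += term
    return math.exp(-half_stat) * total


p_cauchy = cauchy_combination([0.01, 0.8])
p_fisher = fisher_combination([0.01, 0.8])
print(p_cauchy, p_fisher)
```

Fisher's method assumes independent p-values, whereas the Cauchy combination is commonly used when the p-values may be dependent, which is one plausible reason a procedure would draw on both.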

Model Free Approach to Quantifying the Proportion of Treatment Effect Explained by a Surrogate Marker

Xuan Wang1,∗, Layla Parast2, Lu Tian3, Tianxi Cai4

1,∗ School of Mathematical Sciences, Zhejiang University, Hangzhou, Zhejiang, China

2 Statistics Group, RAND Corporation, Santa Monica, California 90401 U.S.A.

3 Department of Biomedical Data Science, Stanford University, Stanford, California 94305 U.S.A.

4 Department of Biostatistics, Harvard University, Boston, Massachusetts 02115 U.S.A.

E-mail: [email protected]

Abstract: In randomized clinical trials, the primary outcome, Y, often requires long term follow-up and/or is costly to measure. For such settings, it is desirable to use a surrogate marker, S, to infer the treatment effect on Y, ∆. Identifying such an S and quantifying the proportion of treatment effect on Y explained (PTE) by the effect on S are thus of great importance. Most existing methods for quantifying the PTE are model-based and may yield biased estimates under model mis-specification. Recently proposed non-parametric methods require strong assumptions to ensure that the PTE lies in [0, 1]. Additionally, optimal use of S to approximate ∆ is especially important when S relates to Y non-linearly. In this paper, we identify an optimal transformation of S, $g_{opt}$, such that the PTE can be inferred based on $g_{opt}(S)$. In addition, we provide two novel model free definitions of PTE and simple conditions for ensuring that the PTE lies in [0, 1]. We provide non-parametric estimation procedures and establish asymptotic properties of the proposed estimators. Simulation studies demonstrate that the proposed methods perform well in finite samples. We illustrate the proposed procedures using a randomized study of HIV patients.

Key Words: Non-parametric estimation; Proportion of treatment effect explained; Randomized clinical trial; Surrogate marker.

Scattering Transform and Stylometry Analysis in Arts

Yang Wang1

1 Department of Mathematics, Hong Kong University of Science and Technology

E-mail: [email protected]

Abstract: With the rapid advancement in data analysis and machine learning, stylometry analysis in arts has gained considerable interest in recent years. A fundamental topic of research in stylometry is the detection of art forgery. But unlike many other machine learning applications, we typically face the challenge of not having enough data. In this talk I'll discuss how scattering transform can be applied to stylometry analysis, and demonstrate its effectiveness on Van Gogh paintings as well as another data set.

A Fast and Practical Randomized Method for Low-Rank Tensor Approximations

Yao Wang1,∗

1,∗ Xi’an Jiaotong University, 710049, P.R. China

E-mail: [email protected]

Abstract: Low-rank tensor approximations have gained much attention in dealing with real-world applications such as dynamic medical image processing and multi-channel video analysis, because of their efficiency in exploiting the intrinsic structures of the data with limited parameters. Unfortunately, the popular tensor decompositions that yield efficient low rank approximations, namely Tucker decomposition and tensor Singular Value Decomposition (t-SVD), require computing many SVDs and are in general prohibitively expensive, which limits their use in “big data” environments. To remedy this issue, in this work, we present a randomized URV decomposition for producing fast and efficient low rank tensor approximations with theoretical guarantees by using Tucker decomposition and t-SVD. To be more precise, our method incorporates a strong rank-revealing QR decomposition that can make the computations of Tucker decomposition and t-SVD more stable. We then justify the effectiveness of the obtained low rank tensor approximations through a series of synthetic data experiments and several real-world applications. The extensive experimental results demonstrate the superior performance of our procedures over the existing methods in terms of both robustness and computational speed.

Key Words: Low-rank tensor approximations; Tucker decomposition; t-SVD; URV decomposition; randomized algorithms

The traps that must be encountered in machine learning practices

Hu Wei 1

1 Yingying Group, Inc., China

E-mail: [email protected]

Abstract: In machine learning practices, we often encounter the following difficulties: inconsistency between data samples and application scenarios, discrepancy between model training and application environment, systematic missing and variation of data, censoring of observed samples, and so on. These problems vary in different businesses, but they remain essentially the same in all respects, such as data, features and model methods. Based on several case studies in the field of Internet on the applications of cutting-edge machine learning technology, the presentation helps understand how to analyze data, select assumptions suitable for data, and then design the modeling process of data preprocessing, algorithm selection and optimization, model evaluation and deployment.


75

Flexible Experimental Designs for Valid Single-cell RNA-sequencing Experiments Allowing Batch Effects Correction

Yingying Wei1, Ga Ming Chan2, Fangda Song3

1 Department of Statistics, The Chinese University of Hong Kong, HKSAR

E-mail: [email protected]

Abstract: Despite their widespread applications, single-cell RNA-sequencing (scRNA-seq) experiments are still plagued by batch effects and dropout events. Although the completely randomized experimental design has frequently been advocated to control for batch effects, it is rarely implemented in real applications due to time and budget constraints. Here, we mathematically prove that under two more flexible and realistic experimental designs---the "reference panel" and the "chain-type" designs---true biological variability can also be separated from batch effects. We develop Batch effects correction with Unknown Subtypes for scRNA-seq data (BUSseq), which is an interpretable Bayesian hierarchical model that closely follows the data-generating mechanism of scRNA-seq experiments. BUSseq can simultaneously correct batch effects, cluster cell types, impute missing data caused by dropout events, and detect differentially expressed genes without requiring a preliminary normalization step. We demonstrate that BUSseq outperforms existing methods with simulated and real data. Key Words: Batch effects; Experimental design; Single-cell RNA-seq experiments; Model-based clustering; Integrative analysis


Recent developments in graph matching: statistical analysis

Jian Ding1∗, Zongming Ma1, Yihong Wu2, Jiaming Xu3∗

1∗ Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, USA

2 Department of Statistics and Data Science, Yale University, New Haven, USA

3 The Fuqua School of Business, Duke University, Durham, USA

E-mail: [email protected]

Abstract: Random graph matching refers to recovering the underlying vertex correspondence between two random graphs with correlated edges; a prominent example is when the two random graphs are given by Erdős-Rényi graphs $G(n, d/n)$. This can be viewed as an average-case and noisy version of the graph isomorphism problem. Under this model, the maximum likelihood estimator is equivalent to solving the intractable quadratic assignment problem. This work develops an $\tilde{O}(nd^2 + n^2)$-time algorithm which perfectly recovers the true vertex correspondence with high probability, provided that the average degree is at least $d = \Omega(\log^2 n)$ and the two graphs differ by at most a $\delta = O(\log^{-2} n)$ fraction of edges. For dense graphs and sparse graphs, this can be improved to $\delta = O(\log^{-2/3} n)$ and $\delta = O(\log^{-2} d)$ respectively, both in polynomial time. The methodology is based on appropriately chosen distance statistics of the degree profiles (empirical distribution of the degrees of neighbors). Before this work, the best known result achieves $\delta = O(1)$ and $n^{o(1)} \le d \le n^c$ for some constant $c$ with an $n^{O(\log n)}$-time algorithm, and $\delta = \tilde{O}((d/n)^4)$ and $d = \tilde{\Omega}(n^{4/5})$ with a polynomial-time algorithm.

Key Words: Graph matching, Degree profiles, Quadratic assignment problem, Random graph isomorphism, Erdős-Rényi graphs

76


77

Differential Markov Random Field Analysis

Yin Xia1, Tony Cai2, Hongzhe Li3, Jing Ma3

1Department of Statistics, Fudan University, China

2Department of Statistics, University of Pennsylvania, USA

3Department of Biostatistics, University of Pennsylvania, USA

E-mail: [email protected]

Abstract: In this talk, we propose a flexible Markov random field model for learning the microbial community structure and introduce a testing framework for detecting the difference between networks, also known as differential network analysis. Our global test for differential networks is particularly powerful against sparse alternatives. In addition, we develop a multiple testing procedure with false discovery rate control to infer the structure of the differential network. The proposed method is applied to a gut microbiome study on UK twins to detect the microbial interactions associated with the age of the host. Key Words: Differential network; High dimensional logistic regression; Microbiome; Multiple testing


78

Building a Translational Research Program in Neuroinflammation: A Data Driven Approach to Advance Precision Medicine for Multiple Sclerosis

Zongqi Xia1,2, Liang Liang3, Tianrun Cai1,4, Kumar Dahal5, Chen Lin5, Sean Finan5, Guergana Savova5, Tanuja Chitnis6, Howard Weiner6, Philip De Jager2,7, Tianxi Cai3

1Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA

2Cell Circuits Program, Broad Institute, Cambridge, MA, USA

3Department of Statistics, Harvard School of Public Health, Boston, MA, USA

4Department of Rheumatology, Brigham and Women’s Hospital, Boston, MA, USA

5Clinical Natural Language Processing Program, Boston Children’s Hospital, Boston, MA, USA

6Department of Neurology, Brigham and Women’s Hospital, Boston, MA, USA

7Center for Translational & Computational Neuroimmunology and the Columbia MS Center, Department of Neurology, Columbia University Medical Center, New York City, NY, USA

E-mail: [email protected]

Abstract: Multiple sclerosis (MS) is a chronic neurological disease with a disproportionately high socioeconomic burden. Given the wide spectrum of disease trajectories among people with MS and their diverse responses to treatments, there is an unmet need to bring precision medicine to MS. In this presentation, I will primarily discuss our ongoing efforts in developing analytical approaches to ascertain disease activity and predict treatment response using electronic health records data. Tools that leverage real-life clinical data for outcome prediction in chronic neurological disorders have the potential for widespread dissemination at the point of care. I will additionally highlight complementary research strategies that leverage prospective cohort studies to investigate MS onset and disease evolution as part of a broader program to advance precision medicine for MS. Key Words: multiple sclerosis; precision medicine; electronic health records; disease activity; treatment response


79

Realized volatility forecasting with HAR-GARCH type model: a Bayesian approach

Yunxian Li1,∗, Han Xiang2

1,∗ Yunnan University of Finance and Economics, 650221, China

2Yunnan University of Finance and Economics, 650221, China

E-mail: [email protected]

Abstract: In this paper, HAR-GARCH-type models are proposed to analyze crude oil price volatilities. Bayesian approaches, including Bayesian model estimation and model comparison, are developed for HAR-GARCH-type models. MCMC methods are applied to obtain Bayesian estimates of the unknown parameters of the proposed model. A Bayesian criterion-based statistic, called the Lv measure, is proposed as the model comparison statistic for HAR-GARCH-type models. In addition, Bayesian forecasting is discussed. Using the proposed models and methodologies, 5-minute price data of WTI crude oil futures are analyzed. Different models, including HAR-type and HAR-GARCH-type models, are considered and compared. According to the Lv measure, the HAR-GARCH-type model performs better than the HAR-type model. It is reasonable to consider the HAR-GARCH-type model because heterogeneity exists in the realized volatilities. The future realized volatilities are predicted based on the selected HAR-GARCH model.

Key Words: Oil price; Realized volatility; Model selection; Bayesian approach
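As a rough illustration of the quantities named in this abstract, the sketch below computes a daily realized variance from intraday prices and the three regressors of a standard HAR-type regression. The window lengths (5 and 22 trading days) are the conventional choices and are an assumption here, as are the function names; the abstract does not specify them, and the GARCH component and the Bayesian MCMC estimation are not reproduced.

```python
import math
from typing import List, Tuple


def realized_variance(prices: List[float]) -> float:
    """Sum of squared log returns over one day of intraday prices."""
    return sum(math.log(b / a) ** 2 for a, b in zip(prices, prices[1:]))


def har_regressors(rv: List[float], t: int) -> Tuple[float, float, float]:
    """Daily, weekly and monthly averages of realized variance up to day t.

    Assumed windows: 1, 5 and 22 trading days (hypothetical defaults,
    not taken from the abstract).
    """
    if t < 21 or t >= len(rv):
        raise ValueError("need 22 days of history ending at day t")
    daily = rv[t]
    weekly = sum(rv[t - 4 : t + 1]) / 5
    monthly = sum(rv[t - 21 : t + 1]) / 22
    return daily, weekly, monthly
```

In a HAR-type model these three averages would serve as predictors of the realized variance on day t + 1.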


80

TBA

Han XIAO1

1 Rutgers University, USA

E-mail: [email protected]

Abstract: TBA


81

Multiple Testing Embedded in an Aggregation Tree to Identify where Two Distributions Differ

Jichun Xie 1

1 Duke University, Duke Box 103854, Durham, NC

E-mail: [email protected]

Abstract: A key goal of flow cytometry data analysis is to identify the subpopulation of cells whose attributes are responsive to the treatment. These cells are assumed to be sparse among the entire cell population. To identify them, we propose a novel multiple TEsting on the Aggregation tree Method (TEAM) to locate where the treated and the reference distributions differ. TEAM has a bottom-up hierarchical framework. On the bottom layer, we search for short-range spiky distributional differences in small bins, while on the higher layers, we search for long-range weak distributional differences. The active testing sets and the rejection rule on a higher layer depend on the testing results of the lower layers. Under mild conditions, we prove that TEAM yields a false discovery proportion (FDP) converging to the desired level. Extensive simulations show that the proposed method is valid and has much better power than single-layer multiple testing methods and the multi-resolution scanning method. We then apply TEAM to a flow cytometry study, where we successfully identify the cell subpopulation that is responsive to the cytomegalovirus antigen. Key Words: Hierarchical multiple testing; aggregation tree; equal-power binning; false discovery rate (FDR); distribution difference; flow cytometry (FCM)


82

Pearson's statistics: approximation theory and beyond

Mengyu Xu1, Danna Zhang2*, Wei Biao Wu3

1Department of Statistics, University of Central Florida, 32816, U.S.A.

2Department of Mathematics, University of California, San Diego, 92093, U.S.A.

3Department of Statistics, University of Chicago, 60637, U.S.A.

E-mail: [email protected]

Abstract: We establish an approximation theory for Pearson's chi-squared statistics in situations where the number of cells is large, by using a high-dimensional central limit theorem for quadratic forms of random vectors. Our high-dimensional central limit theorem is proved under Lyapunov-type conditions that involve a delicate interplay between the dimension, the sample size and the moment conditions. We propose a modified chi-squared statistic and the concept of adjusted degrees of freedom. Our simulation study shows the modified statistic outperforms Pearson's chi-squared statistic in both size accuracy and power. Our procedure is applied to the construction of a goodness-of-fit test for Rutherford's alpha particle data. Key Words: Adjusted degrees of freedom; goodness-of-fit test; invariance principle; large p small n; Pearson's chi-squared statistic
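For reference, the classical statistic that this abstract takes as its starting point is the sum over cells of (observed - expected)^2 / expected, with expected count n * p_i under the null cell probabilities. The sketch below computes only this classical form; the modified statistic and the adjusted degrees of freedom proposed by the authors are not given in the abstract and are not reproduced. The function name `pearson_chi2` and the example counts are illustrative.

```python
from typing import Sequence


def pearson_chi2(observed: Sequence[int], probs: Sequence[float]) -> float:
    """Classical Pearson chi-squared statistic for multinomial counts."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    n = sum(observed)
    stat = 0.0
    for o, p in zip(observed, probs):
        if p <= 0:
            raise ValueError("null cell probabilities must be positive")
        expected = n * p  # expected count under the null
        stat += (o - expected) ** 2 / expected
    return stat


# Three cells, 40 observations, null probabilities (0.25, 0.25, 0.5).
print(pearson_chi2([20, 10, 10], [0.25, 0.25, 0.5]))
```

When the number of cells is large relative to the sample size, many expected counts are small, which is the regime the abstract addresses.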


83

Distribution and correlation free two-sample test of high-dimensional means

Kaijie XUE1

1 University of Toronto, Canada

E-mail: [email protected]

Abstract: We propose a two-sample test for high-dimensional means that requires neither distributional nor correlational assumptions, besides some weak conditions on the moments and tail properties of the elements in the random vectors. This two-sample test, based on a nontrivial extension of the one-sample central limit theorem (Chernozhukov et al., 2017), provides a practically useful procedure with rigorous theoretical guarantees on its size and power assessment. In particular, the proposed test is easy to compute and does not require the independent and identically distributed assumption: the observations are allowed to have different distributions and arbitrary correlation structures. Further desirable features include weaker moment and tail conditions than existing methods, allowance for highly unequal sample sizes, consistent power behavior under fairly general alternatives, and a data dimension allowed to be exponentially high under the umbrella of such general conditions. Simulated and real data examples are used to demonstrate the favorable numerical performance over existing methods.


84


85

A statistical and machine learning framework for new energy vehicle ride sharing system

Kaixian Yu3,*, Jinliang Deng1,*, Chengchun Shi2,*, Rui Song2, Qiang Yang1, Jieping Ye3, Hongtu Zhu3

1Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China

2Department of Statistics, North Carolina State University, Raleigh, NC, US

3AI Labs, Didi Chuxing, Beijing, China

*These authors contributed equally

E-mail: [email protected]

Abstract: Recently, the number of electric vehicles (EVs) serving online ride-hailing companies such as Uber and Didi Chuxing has increased rapidly. Unlike conventional fuel vehicles, EVs have some unique characteristics: they do not travel as far as fuel vehicles, and they take much longer to charge. Adapting the dispatching systems of online ride-hailing companies to these characteristics is becoming increasingly important. In this talk, we will present our recent progress on two major components of an EV-friendly dispatching system. First, we will introduce a stochastic partial differential equation approach to model the power consumption of an EV. The power consumption model takes real-time vehicle and environment factors into account to estimate the state of charge. Second, we will introduce a deep multi-objective reinforcement learning approach to solve the order dispatching problem based on the estimated state of charge of EVs. Some results on real data and a simulated system will be shown as well.

Key Words: Applied statistics; new energy vehicle; online ride-hailing platform; stochastic differential equation; deep multi-objective reinforcement learning


86

Word Segmentation and Term Discovery in Chinese Electronic Medical Records Using Graph Theory and Deep Learning

Zheng Yuan1#, Yuanhao Liu2#, Qiuyang Yin3, Boyao Li4, Sheng Yu1*

1Center for Statistical Science, Tsinghua University, Beijing, China;

2Department of Statistics, University of Michigan, Ann Arbor, MI, USA;

3Department of Automation, Tsinghua University, Beijing, China;

4Department of Physics, Tsinghua University, Beijing, China;

# contributed equally

E-mail: [email protected]

Abstract: Natural language processing (NLP) for electronic medical records (EMR) generally requires a comprehensive medical dictionary that covers standard terminology, term variations, and abbreviations. In this work, we present an automated method to discover medical terms from Chinese EMR notes, using both graph theory and deep learning. The automatic Chinese medical term discovery pipeline has two steps. The first step is word segmentation. We propose a graph theory-based unsupervised word segmentation algorithm that treats the sentence as an undirected graph whose nodes are the characters. Various techniques can be used to compute the edge weights that measure the connection strength between characters. Spectral graph partition algorithms are used to group the characters and achieve word segmentation. Segmenting the EMR corpus provides a list of candidate medical terms. The word segmentation result may contain errors, i.e., word boundaries may be placed incorrectly. Thus, in the second step, we train a bi-directional LSTM neural network discriminator to remove wrong segmentation results. The model input includes the candidate term and ±4 characters around the term in the text. The neural network uses an embedding layer for both the characters and the forward/backward positions. We created training data by simulation. Positive samples are terms identified using a dictionary-based segmenter loaded with a general domain dictionary and a Chinese-English medical dictionary. Negative samples are obtained by randomly adding/removing 1 or 2 characters to/from either end of a positive sample to imitate segmentation errors. The trained discriminator is applied to the segmentation result. Terms that appear repeatedly in the EMR corpus are classified repeatedly. Terms not in the general domain dictionary and accepted over 10 times by the model are kept as predicted medical terms.

Key Words: electronic medical records; word segmentation; term discovery; deep learning; graph partition
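The negative-sample construction described in this abstract (moving one boundary of a known term by 1 or 2 characters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `perturb_boundary`, the uniform random choices, and the example sentence are all hypothetical.

```python
import random
from typing import Optional


def perturb_boundary(text: str, start: int, end: int,
                     rng: random.Random) -> Optional[str]:
    """Imitate a segmentation error on the term text[start:end].

    One end of the term is moved by 1 or 2 characters, either outward
    (adding context characters) or inward (removing term characters).
    Returns None when the shifted span leaves the text or becomes empty.
    """
    shift = rng.choice((1, 2))
    sign = rng.choice((1, -1))  # +1 adds characters, -1 removes them
    if rng.choice(("left", "right")) == "left":
        new_start, new_end = start - sign * shift, end
    else:
        new_start, new_end = start, end + sign * shift
    if new_start < 0 or new_end > len(text) or new_end - new_start < 1:
        return None
    return text[new_start:new_end]


# Hypothetical example: the positive term is text[3:6].
text = "患者有高血压病史多年"
rng = random.Random(0)
negatives = [perturb_boundary(text, 3, 6, rng) for _ in range(5)]
print(negatives)
```

A caller would typically discard `None` results and draw again until enough negative samples are collected.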


87

Statistical Inference Based on Sufficient Dimension Reduction

Zhou Yu1

1 East China Normal University, China

E-mail: [email protected]

Abstract: TBA


88

Global Optimality of Stochastic Semi-definite Optimization with Application to Ordinal Embedding

Jinshan Zeng1,2∗, Ke Ma3, Yuan Yao2

1 School of Computer Science, Jiangxi Normal University, Nanchang, China

2 Department of Mathematics, Hong Kong University of Science and Technology, HKSAR

3 Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China

E-mail: [email protected]

Abstract: Nonconvex reformulations via low-rank factorization for stochastic convex semi-definite optimization problems have attracted increasing attention due to their empirical efficiency and scalability. However, this raises a new question: under what conditions can nonconvex stochastic algorithms find the population minimizer within the optimal statistical precision, despite their empirical success in applications? In this talk, we provide an answer: the stochastic gradient descent (SGD) method can be adapted to solve the nonconvex reformulation of the original convex problem, with global linear convergence when using a fixed step size, i.e., converging exponentially fast to the population minimizer within an optimal statistical precision, given a proper initial choice in the restricted strongly convex case. If a diminishing step size is adopted, the adverse effect of the gradient variance on the optimization error can be eliminated, but the rate drops to sublinear. We then propose an accelerated stochastic algorithm, SVRG, and establish its global linear convergence. Finally, we apply our developed stochastic algorithms to the ordinal embedding problem and demonstrate their effectiveness.

Key Words: Stochastic gradient descent; SVRG; semidefinite optimization; low-rank factorization; ordinal embedding


89

High-dimensional Tensor Regression Analysis

Anru Zhang1

1University of Wisconsin-Madison, Madison WI, USA

E-mail: [email protected]

Abstract: The past decade has seen a large body of work on high-dimensional tensors or multiway arrays that arise in numerous applications and have been applied to many statistical problems. In many of these settings, the tensor of interest is high-dimensional in that the ambient dimension is substantially larger than the sample size. Oftentimes, however, the tensor comes with natural low-rank or sparsity structure. How to exploit such structure for tensors poses new statistical and computational challenges. In this talk, we introduce a novel procedure for low-rank tensor regression, namely Importance Sketching Low-rank Estimation for Tensors (ISLET), which addresses these challenges. The central idea behind ISLET is what we call importance sketching, carefully designed structural sketches based on higher order orthogonal iteration (HOOI) and combining sketched estimated components using the recently developed Cross procedure. We show that our estimating method is sharply minimax optimal in terms of the mean-squared error under low-rank Tucker assumptions. In addition, if a tensor is low-rank with group sparsity, our procedure also achieves minimax optimality. Further, we show through numerical study that ISLET achieves comparable mean-squared error performance to existing state-of-the-art methods whilst having substantial storage and run-time advantages. In particular, our procedure performs reliable tensor estimation with tensors of dimension p = O(10^8) and is 1 or 2 orders of magnitude faster than baseline methods. Key Words: Dimension reduction; high-order orthogonal iteration; minimax optimality; sketching; tensor regression


90

Enhanced Pulmonary Nodule Detection Using Fully Automated Deep Learning: A Multifactor Investigation

Chi Zhang1∗, Jiechao Ma2, Shiyuan Liu3

1∗ Presenter, Beijing Infervision Inc., 100085, China

2 Beijing Infervision Inc., 100085, China

3 Changzheng Hospital, Second Military Medical University, 200003, China

E-mail: [email protected]

Abstract: Neural network based deep learning (DL) algorithms have been successfully used in the detection of lung nodules in CT scans, improving efficiency while reducing the burden on radiologists. We compared the detection sensitivity of a DL model with that of radiologists, and verified whether DL models could assist radiologists in enhancing baseline screening detection. In this retrospective study, we compared the false discovery rate (FDR) and localization receiver operating characteristic (LROC) curves of radiologists working alone and with DL model assistance. The results showed that, for all cohorts, the DL model had higher overall sensitivity than manual detection and was insensitive to radiation dose, patient age, or CT equipment. It was also shown to enhance manual screening, which may lead to better pulmonary nodule management.

Key Words: lung cancer screening; pulmonary nodule detection; deep learning; computer-aided detection


91

Heteroscedasticity test based on high-frequency data with jumps and microstructure noise

Qiang Liu1, Zhi Liu2, Chuanhai Zhang3

1 Department of Mathematics, National University of Singapore, Singapore

2 Department of Mathematics, University of Macau, Macau SAR, China

3 School of Finance, Zhongnan University of Economics and Law, Wuhan 430073, China

E-mail: [email protected]

Abstract: In this paper, we are interested in testing whether the volatility process is constant over a given time span by using high-frequency data, while allowing for the possible presence of jumps and microstructure noise. Based on estimators of integrated volatility and spot volatility, we propose a new nonparametric way to depict the discrepancy between local variation and global variation. We show that our test estimator converges to a standard normal distribution if the volatility is constant; otherwise it diverges to infinity. Simulation studies verify our theoretical results and show the good finite sample performance of our test procedure. We also apply our estimator to carry out the heteroscedasticity test on some real financial high-frequency data. Key Words: High-frequency data; Market microstructure noise; jumps; Heteroscedasticity; Nonparametric test; Integrated volatility; Spot volatility


92

Factor Models for High-Dimensional Tensor Time Series

Cun-Hui Zhang1

1Rutgers University, USA

E-mail: [email protected]

Abstract: Large tensor data are now routinely collected in a wide range of applications due to rapid development of information technologies and their broad implementation in our era. Often such observations are taken over time, forming tensor time series. In this paper we present a factor model approach for analyzing high-dimensional dynamic tensor time series and multi-category dynamic transport networks. Two estimation procedures are presented along with their theoretical properties and simulation results. Real applications are used to illustrate the model and its interpretations. This talk is based on joint work with Rong Chen and Dan Yang. Key Words: tensor; time series; factor model; loading matrix; decomposition


93

Stochastic differential reinsurance games with capital injections

Nan Zhang1, Zhuo Jin2, Linyi Qian3, Kun Fan4

1,3,4 East China Normal University, Shanghai 200062, China

2 The University of Melbourne, VIC 3010, Australia

E-mail: [email protected]

Abstract: This paper investigates a class of reinsurance game problems between two insurance companies under the framework of non-zero sum stochastic differential games. Both insurers can purchase proportional reinsurance contracts from reinsurance markets and have the option of determining the time and amount of capital injections, which are described by impulse controls. We assume the reinsurance premium is calculated under the generalized variance premium principle. The objective of each insurer is to maximize the expected value that synthesizes the discounted utility of its surplus relative to a reference point, the penalties caused by its capital injection interventions, and the gains brought by the capital injections of its competitor. We prove the verification theorem and derive explicit expressions of the Nash equilibrium strategy by solving the corresponding quasi-variational inequalities. Numerical examples are also presented to illustrate our results.

Key Words: Stochastic differential game, Impulse control, Nash equilibrium, Quasi-variational inequality


94

Structured sparse logistic regression with application to lung cancer prediction using breath volatile biomarkers

Xiaochen Zhang1, Qingzhao Zhang1,2, Xiaofeng Wang3, Shuangge Ma4, Kuangnan Fang1,*

1Department of Statistics, School of Economics, Xiamen University, China

2The Wang Yanan Institute for Studies in Economics, Xiamen University, China

3Department of Quantitative Health Sciences/Biostatistics Section Cleveland Clinic Lerner Research Institute, Cleveland, OH, USA

4Department of Biostatistics, Yale School of Public Health, USA

E-mail: [email protected]

Abstract: This article is motivated by a study of lung cancer prediction using breath volatile organic compound (VOC) biomarkers, where the challenge is that the predictors include not only high-dimensional time-dependent or functional VOC features but also the time-independent clinical variables. We consider a high-dimensional logistic regression and propose two different penalties: group spline-penalty or group smooth-penalty to handle the group structures of the time-dependent variables in the model. The new methods have the advantage for the situation where the model coefficients are sparse but change smoothly within the group, compared with other existing methods such as the group lasso and the group bridge approaches. Our methods are easy to implement since they can be turned into a group minimax concave penalty problem after certain transformations. We show that our fitting algorithm possesses the descent property and leads to attractive convergence properties. The simulation studies and the lung cancer application are performed to demonstrate the accuracy and stability of the proposed approaches.

Key Words: group spline-penalty; group smooth-penalty; variable selection; time-dependent variables; high-dimensional data


95

Simulated Distribution Based Learning for Non-regular and Regular Statistical Inferences

Bingyan Wang1, Zhengjun Zhang2

1Peking University, China

2University of Wisconsin, USA

E-mail: [email protected]

Abstract: Statistical research involves drawing inference about unknown quantities (e.g., parameters) in the presence of randomness, in which distributional assumptions on random variables (e.g., error terms in regression analysis) play a central role. However, the fundamental issue of preserving these distributional assumptions has been more or less ignored by many inference methods and applications. As a result, further inference on the problems studied, and related decisions based on the estimated parameter values, may be inferior. This paper proposes a continuous distribution-preserving estimation approach for various kinds of non-regular and regular statistical studies. The paper establishes a fundamental theorem which guarantees that the order statistics of a random variable (or an error term), transformed from its assumed distribution to a given marginal, are arbitrarily close to the order statistics of a simulated sequence from the same marginal distribution. Unlike the Kolmogorov-Smirnov test, which is based on absolute errors between the empirical distribution and the assumed distribution, the statistics proposed in the paper are based on relative errors of the transformed order statistics with respect to the simulated ones. Using the constructed statistic (or the pivotal quantity in estimation) as a measure of the relative distance between two ordered samples, we estimate parameters such that the distance is minimized. Unlike many existing methods, e.g., maximum likelihood estimation, which rely on regularity conditions and/or the explicit form of the probability density function, the new method assumes only the mild condition that the cumulative distribution function can be approximated to a satisfactory precision. The paper presents simulation examples that illustrate its superior performance. In linear regression settings, the proposed estimation performs exceptionally well at preserving the normality of the error terms (i.e., the residuals), which is a fundamental assumption in linear regression theory and applications. Key Words: estimation; extreme value theory; inverse approximation of distributions; relative errors; simulation.
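The estimation idea described in this abstract can be illustrated with a toy sketch. It is not the authors' statistic: the exponential model, the unit-exponential target marginal, the mean absolute relative error, the grid search, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np

def relative_distance(x, scale, simulated_sorted):
    """Mean relative error between transformed and simulated order statistics."""
    # Assumed model: X ~ Exponential(scale); map the sorted data to (0, 1) with its CDF.
    u = 1.0 - np.exp(-np.sort(x) / scale)
    u = np.clip(u, 1e-12, 1.0 - 1e-12)
    # Map to the target marginal (unit exponential) with its quantile function.
    transformed = -np.log1p(-u)
    return np.mean(np.abs(transformed - simulated_sorted) / simulated_sorted)

def fit_scale(x, grid, rng):
    """Pick the scale on the grid that minimizes the relative distance."""
    # Simulated sequence from the same target marginal, sorted once.
    simulated_sorted = np.sort(rng.exponential(1.0, size=x.size))
    distances = [relative_distance(x, s, simulated_sorted) for s in grid]
    return float(grid[int(np.argmin(distances))])

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=2000)  # synthetic data, true scale 2.0
estimate = fit_scale(data, np.linspace(0.5, 5.0, 451), rng)
```

Only the cumulative distribution function of the assumed model is evaluated here; no density or likelihood is needed, which mirrors the mild condition stated in the abstract.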


Estimation and inference for the indirect effect in high-dimensional linear mediation models

Ruixuan Zhou1, Liewei Wang2, Sihai Dave Zhao1,*

1Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA

2Division of Clinical Pharmacology, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA

E-mail: [email protected]

Abstract: Mediation analysis is difficult when the number of potential mediators is larger than the sample size. We propose new inference procedures for the indirect effect in the presence of high-dimensional mediators for linear mediation models. We develop methods for both incomplete mediation, where a direct effect may exist, as well as complete mediation, where the direct effect is known to be absent. We prove consistency and asymptotic normality of our indirect effect estimators. Under complete mediation, where the indirect effect is equivalent to the total effect, we further prove that our approach gives a more powerful test compared to directly testing for the total effect. We apply our method to an integrative analysis of gene expression and genotype data from a pharmacogenomic study of drug response. We present a novel analysis of gene sets to understand the molecular mechanisms of drug response, and also identify a genome-wide significant noncoding genetic variant that cannot be detected using standard analysis methods. Key Words: High-dimensional Inference; Integrative Genomics; Mediation Analysis


Factor Modeling for Volatility

Xinghua Zheng1, Yingying Li2, Rob Engle3, Yi Ding4

1Hong Kong University of Science and Technology, HKSAR

3New York University, USA

E-mail: [email protected]

Abstract: This talk consists of two parts. In the first part, under a high-frequency and high-dimensional setup, we establish a framework to estimate the factor structure in idiosyncratic volatility. We show that the factor structure can be consistently estimated by conducting principal component analysis on the idiosyncratic realized volatilities. Empirically, we confirm and identify the factor structure in the idiosyncratic volatilities of S&P 500 Index constituents. In the second part, motivated by strong empirical evidence, a single-factor volatility model is proposed. Empirical examination shows that this simple model explains the co-movement of volatilities well and leads to a substantial gain in volatility forecasting. Key Words: Volatility; Factor model; High-frequency data; principal component analysis
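The principal component step mentioned in this abstract can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the authors' estimator: the one-factor panel is simulated rather than built from realized volatilities, and the function name `pca_factors` and all parameter values are hypothetical.

```python
import numpy as np

def pca_factors(panel, n_factors):
    """Principal components of a (time x assets) panel via SVD."""
    centered = panel - panel.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance share per component
    factors = u[:, :n_factors] * s[:n_factors]   # estimated factor series
    loadings = vt[:n_factors].T                  # estimated loadings
    return factors, loadings, explained

rng = np.random.default_rng(1)
n_days, n_assets = 500, 50
true_factor = rng.standard_normal(n_days)
true_loadings = rng.uniform(0.5, 1.5, size=n_assets)
# Synthetic, centered stand-in for a panel of idiosyncratic volatilities.
noise = 0.5 * rng.standard_normal((n_days, n_assets))
panel = np.outer(true_factor, true_loadings) + noise

factors, loadings, explained = pca_factors(panel, n_factors=1)
# The factor is identified only up to sign and scale, so compare by correlation.
corr = abs(np.corrcoef(factors[:, 0], true_factor)[0, 1])
```

A dominant first entry in `explained` is the kind of evidence that would motivate a single-factor model in this simulated setting.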


Sequential scaled sparse factor regression

Zemin Zheng1

1 University of Science and Technology of China, China

E-mail: [email protected]

Abstract: TBA


Estimating Endogenous Treatment Effect Using High-Dimensional Instruments with an Application to the Olympic Effect

Wei Zhong1,∗, Wei Zhou2, Qingliang Fan2, Yang Gao2

1,∗ Xiamen University 2 Xiamen University

E-mail:[email protected]

Abstract: Endogenous treatments are commonly encountered in program evaluations using observational data where the selection-on-observables assumption does not hold. In this paper, we develop a two-stage approach to estimate endogenous treatment effects using high-dimensional instrumental variables. In the first stage, instead of using a linear reduced form regression as in the conventional two-stage least squares (TSLS) approach, we propose a new high-dimensional logistic reduced form model with the SCAD penalty to approximate the optimal instrument. In the second stage, we replace the original treatment variable by its estimated propensity score and run a least squares regression to obtain the penalized Logistic-regression Instrumental Variables Estimator (LIVE). We show that the proposed LIVE is root-n consistent for the true average treatment effect, asymptotically normal, and achieves the semiparametric efficiency bound. Monte Carlo simulations demonstrate that the LIVE outperforms the traditional TSLS estimator and the post-Lasso estimator for endogenous treatment effects. Moreover, in the empirical study, we investigate whether the Olympic Games could facilitate the host nation's economic growth using data from 163 countries. The proposed LIVE estimator shows a strong Olympic effect on the host nation's economic growth. Key Words: Endogenous treatment effect; high dimensionality; instrumental variable; logistic regression; variable selection


Approximation Theory of Deep Convolutional Neural Networks

Ding-Xuan Zhou1

1 School of Data Science, City University of Hong Kong, HKSAR

E-mail: [email protected]

Abstract: Deep learning has been widely applied and has brought breakthroughs in speech recognition, computer vision, and many other domains. The deep neural network architectures involved and the associated computational issues have been well studied in machine learning. However, a theoretical foundation is still lacking for understanding the approximation or generalization ability of deep learning models with network architectures such as deep convolutional neural networks (CNNs). The convolutional architecture makes deep CNNs essentially different from fully-connected deep neural networks, and the classical approximation theory of fully-connected networks developed around 30 years ago does not apply. This talk describes an approximation theory of deep CNNs. In particular, we show the universality of a deep CNN, meaning that it can be used to approximate any continuous function to arbitrary accuracy when the depth of the neural network is large enough. Our quantitative estimate, given tightly in terms of the number of free parameters to be computed, verifies the efficiency of deep CNNs in dealing with large dimensional data. Some related distributed learning algorithms will also be discussed.


Global Convergence of EM

Harrison H. Zhou1,*, Yihong Wu2

1Yale University, New Haven, CT, USA

2Yale University, New Haven, CT, USA

E-mail: [email protected]

Abstract: For Gaussian mixtures with two symmetric components, we show the global convergence of EM for a random initialization without any separation condition. Key Words: EM; Global Convergence; Gaussian Mixtures
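For concreteness, the EM iteration for a symmetric two-component mixture can be written in a few lines. The sketch below assumes the model 0.5 N(θ, I) + 0.5 N(−θ, I) with known identity covariance, in which case the E-step and M-step combine into θ ← mean(tanh(⟨θ, x⟩) x). The simulated data, sample size, iteration count, and function name are illustrative assumptions; the run is a numerical illustration, not a proof of the convergence result.

```python
import numpy as np

def em_symmetric_mixture(x, theta0, n_iter=200):
    """EM for 0.5*N(theta, I) + 0.5*N(-theta, I) with known identity covariance."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        # E-step: 2 * P(label = +1 | x) - 1 equals tanh(<theta, x>).
        weights = np.tanh(x @ theta)
        # M-step: signed, weighted average of the observations.
        theta = (weights[:, None] * x).mean(axis=0)
    return theta

rng = np.random.default_rng(2)
theta_star = np.array([1.0, 0.0])
labels = rng.choice([-1.0, 1.0], size=5000)
x = labels[:, None] * theta_star + rng.standard_normal((5000, 2))

# Random initialization, with no attempt to start near theta_star.
theta_hat = em_symmetric_mixture(x, theta0=rng.standard_normal(2))
# The sign of theta is not identifiable, so measure the error up to a sign flip.
error = min(np.linalg.norm(theta_hat - theta_star),
            np.linalg.norm(theta_hat + theta_star))
```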


R package for new normality test

Maoyuan Zhou1,*

1 Civil Aviation University of China, 300300, China

E-mail: [email protected]

Abstract: This paper studies a new normality test proposed by Jin Zhang (2005) and implements it in R as a complete, practical package, NTest, which mainly comprises three functions for testing normality. We compare these functions with the normality test functions commonly used in R; the results are reported in figures and rank tables, which show that the new test has higher power than the common normality tests. We further program several R functions, based on the same test statistics, for two-sample and multiple-sample homogeneity tests (two samples: hoza.test, hozc.test and hozk.test; multiple samples: khomoza.test, khomozc.test and khomozk.test). A comparative experiment then compares their power with that of the homogeneity test functions commonly used in R. In the two-sample case, the new method has higher power than the K-S test. In the multiple-sample case, the new method has higher power for the global homogeneity test, without assuming equal means or equal variances. Key Words: tests for normality; R Language; nonparametric test; homogeneity test


GD-RDA: A new regularized discriminant analysis for high dimensional data

Yan Zhou1

1 Shenzhen University, China

E-mail: [email protected]

Abstract: TBA


Matrix Completion for Network Analysis

Ji Zhu1,*

1,*Department of Statistics, University of Michigan, Ann Arbor, MI, 48105, USA

E-mail: [email protected]

Abstract: Matrix completion is an active area of research in itself, and a natural tool to apply to network data, since many real networks are observed incompletely and/or with noise. However, developing matrix completion algorithms for networks requires taking into account the network structure. This talk will discuss two examples of matrix completion used for network tasks. First, we discuss the use of matrix completion for cross-validation or non-parametric bootstrap on network data, a longstanding problem in network analysis. The second example focuses on reconstructing incompletely observed networks, with structured missingness resulting from the egocentric sampling mechanism, where a set of nodes is selected first and then their connections to the entire network are observed. We show that matrix completion can generally be very helpful in solving network problems, as long as the network structure is taken into account. This talk is based on joint work with Elizaveta Levina, Tianxi Li and Yun-Jhong Wu. Key Words: cross-validation; egocentric network; link prediction; matrix completion; network analysis


Quantile double autoregression

Qianqian Zhu1, Guodong Li2,*

1 Shanghai University of Finance & Economics, 777 Guoding Rd., Shanghai, 200433, P.R.China

2 University of Hong Kong, Pokfulam Road, Hong Kong, P.R.China

E-mail: [email protected]

Abstract: Many financial time series have varying structures at different quantile levels and, at the same time, exhibit conditional heteroscedasticity. However, there is still no time series model that accommodates both features simultaneously. This paper fills the gap by proposing a novel conditional heteroscedastic model, called the quantile double autoregression. The strict stationarity of the new model is derived, and a self-weighted conditional quantile estimation is suggested. Two promising properties of the original double autoregressive model are shown to be preserved. Based on the quantile autocorrelation function and the self-weighting concept, two portmanteau tests are constructed, and they can be used in conjunction to check the adequacy of fitted conditional quantiles. The finite-sample performance of the proposed inference tools is examined by simulation studies, and the necessity of the new model is further demonstrated by analyzing the S&P 500 Index. Key Words: Autoregressive time series model; Conditional heteroscedasticity; Portmanteau test; Quantile model; Strict stationarity.


A Boosting Algorithm for Estimating Generalized Propensity Scores with Continuous Treatments

Yeying Zhu1,*, Donna Coffman2, Debashis Ghosh3

1University of Waterloo, Waterloo, Canada

2Temple University, Philadelphia, USA

3Colorado School of Public Health, Aurora, USA

E-mail: [email protected]

Abstract: In this article, we study the causal inference problem with a continuous treatment variable using propensity score-based methods. For a continuous treatment, the generalized propensity score is defined as the conditional density of the treatment-level given covariates (confounders). The dose–response function is then estimated by inverse probability weighting, where the weights are calculated from the estimated propensity scores. When the dimension of the covariates is large, the traditional nonparametric density estimation suffers from the curse of dimensionality. Some researchers have suggested a two-step estimation procedure by first modeling the mean function. In this study, we suggest a boosting algorithm to estimate the mean function of the treatment given covariates. In boosting, an important tuning parameter is the number of trees to be generated, which essentially determines the trade-off between bias and variance of the causal estimator. We propose a criterion called average absolute correlation coefficient (AACC) to determine the optimal number of trees. Simulation results show that the proposed approach performs better than a simple linear approximation or L2 boosting. The proposed methodology is also illustrated through the Early Dieting in Girls study, which examines the influence of mothers' overall weight concern on daughters' dieting behavior. Key Words: boosting; distance correlation; dose-response function; generalized propensity score
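Inverse probability weighting with a generalized propensity score can be sketched as follows. This is a minimal illustration, not the authors' procedure: an ordinary least-squares fit stands in for the boosted mean model, the residuals are assumed normal, the weights are stabilized by a normal marginal density of the treatment, and the balance measure is taken to be the average absolute weighted Pearson correlation between treatment and covariates, which is only one plausible reading of AACC. All function names and simulation settings are hypothetical.

```python
import numpy as np

def normal_pdf(z, sd):
    """Density of N(0, sd^2) evaluated at z."""
    return np.exp(-0.5 * (z / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def stabilized_weights(treatment, covariates):
    """Weights f(T) / f(T | X) under normal models (OLS stands in for boosting)."""
    design = np.column_stack([np.ones(len(treatment)), covariates])
    coef, *_ = np.linalg.lstsq(design, treatment, rcond=None)
    resid = treatment - design @ coef
    gps = normal_pdf(resid, resid.std())  # estimated f(T | X)
    marginal = normal_pdf(treatment - treatment.mean(), treatment.std())  # f(T)
    return marginal / gps

def avg_abs_corr(treatment, covariates, weights):
    """Average absolute weighted correlation between treatment and covariates."""
    w = weights / weights.sum()
    t = treatment - np.sum(w * treatment)
    x = covariates - w @ covariates
    cov = (w * t) @ x
    sd_t = np.sqrt(np.sum(w * t**2))
    sd_x = np.sqrt(w @ x**2)
    return float(np.mean(np.abs(cov / (sd_t * sd_x))))

rng = np.random.default_rng(3)
covariates = rng.standard_normal((5000, 2))
treatment = covariates @ np.array([0.3, 0.3]) + rng.standard_normal(5000)

weights = stabilized_weights(treatment, covariates)
before = avg_abs_corr(treatment, covariates, np.ones(5000))  # unweighted
after = avg_abs_corr(treatment, covariates, weights)         # weighted
```

In a tuning loop, a balance measure of this kind would be recomputed for each candidate number of trees, and the value giving the smallest weighted correlation would be kept.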


Safe machine learning for safe genome editing

James Zou1

1 Stanford University, USA

E-mail: [email protected]

Abstract: We analyze CRISPR Cas9 repair outcomes in primary human cells to systematically evaluate DNA repair patterns and investigate on-target DNA damage. Leveraging a large new dataset, we develop a novel machine learning model, CRISPR Repair OUTcome (SPROUT), that accurately predicts the length, probability, and sequence of nucleotide insertions and deletions. SPROUT facilitates optimizing genome editing in therapeutically-important primary human cells.


List of Participants

Name Affiliation E-mail

Le BAO Peking University [email protected]

Tony CAI University of Pennsylvania [email protected]

Tianxi CAI Harvard University [email protected]

Jian CAO Shanghai Jiaotong University [email protected]

Jinyuan CHANG Southwestern University of Finance and Economics [email protected]

Yong CHEN University of Pennsylvania [email protected]

Xin CHEN Southern University of Science and Technology [email protected]

Di-Rong CHEN Beihang University [email protected]

Guanghui CHENG Guangzhou University [email protected]

Linlin DAI Southwestern University of Finance and Economics [email protected]

Lilun DU Hong Kong University of Science and Technology [email protected]

Kun FAN East China Normal University [email protected]

Xiaodan FAN The Chinese University of Hong Kong [email protected]

Bo FU Fudan University [email protected]

Tianyu GUAN Simon Fraser University [email protected]

Xiao GUO University of Science and Technology of China [email protected]

Feng GUO Virginia Polytechnic Institute and State University [email protected]

Zijian GUO Rutgers University [email protected]

Zhi HAN Shenyang Institute of Automation, Chinese Academy of Sciences [email protected]

Ning HAO University of Arizona [email protected]

Haijin HE Shenzhen University [email protected]

Jie HU Xiamen University [email protected]

Yen-Tsung HUANG Academia Sinica [email protected]

Ling HUANG AHI Fintech [email protected]

Fei HUANG dtwave [email protected]

Huilin HUANG Wenzhou University [email protected]

Jongho IM Yonsei University [email protected]

Jiancheng JIANG Nankai University, University of North Carolina at Charlotte [email protected]

Xuejun JIANG Southern University of Science and Technology [email protected]

Dandan JIANG Xi'an Jiaotong University [email protected]

Michael I. JORDAN University of California at Berkeley [email protected]

Donggyu KIM KAIST [email protected]


Xinbing KONG Nanjing Audit University [email protected]

Hongzhe LEE University of Pennsylvania [email protected]

Yingying LI Hong Kong University of Science and Technology [email protected]

Yunxian LI Yunnan University of Finance and Economics [email protected]

Lexin LI University of California at Berkeley [email protected]

Guodong LI The University of Hong Kong [email protected]

Zhigang LI University of Florida [email protected]

Deli LI Lakehead University [email protected]

Song LI Zhejiang University [email protected]

Tengyuan LIANG University of Chicago [email protected]

Nan LIN Washington University in St. Louis [email protected]

Yuanyuan LIN The Chinese University of Hong Kong [email protected]

Zhixiang LIN The Chinese University of Hong Kong [email protected]

Shaobo LIN Wenzhou University [email protected]

Chengxiu LING Xi'an Jiaotong-Liverpool University [email protected]

Cheng LIU Wuhan University [email protected]

Xin LIU Shanghai University of Finance and Economics [email protected]

Xu LIU Shanghai University of Economics and Finance [email protected]

Catherine LIU The Hong Kong Polytechnic University [email protected]

Xuanzhe LIU Peking University [email protected]

Baisen LIU Southwestern University of Finance and Economics [email protected]

Xi LIU Synfuels China [email protected]

Weidong LIU Shanghai Jiao Tong University [email protected]

Qi LONG University of Pennsylvania [email protected]

Zhiping LU East China Normal University [email protected]

Shikai LUO Didi Chuxing [email protected]

Xiangyu LUO Renmin University of China [email protected]

Yanyuan MA Pennsylvania State University [email protected]

Xiaojun MAO Fudan University [email protected]

Song MEI Stanford University [email protected]

Yue NIU University of Arizona [email protected]

Xiaoyue NIU Pennsylvania State University [email protected]

Yinghao PAN University of North Carolina at Charlotte [email protected]

Andre PYTHON University of Oxford [email protected]

Lixuan QIN Memorial Sloan Kettering Cancer Center [email protected]

Zhao REN University of Pittsburgh [email protected]


Qi-man SHAO Southern University of Science and Technology [email protected]

Wei SHI Jinan University [email protected]

Xu SHI Harvard University [email protected]

Zhijie SONG Hangzhou Chenhao Company [email protected]

Xinyuan SONG Chinese University of Hong Kong [email protected]

Will Wei SUN University of Miami [email protected]

Tiejun TONG The Hong Kong Baptist University [email protected]

Miaoyan WANG University of Wisconsin-Madison [email protected]

Man WANG Donghua University [email protected]

Cheng WANG Shanghai Jiao Tong University [email protected]

Gui WANG Zhejiang University City College [email protected]

Tao WANG Shanghai Jiao Tong University [email protected]

Tianying WANG Columbia University [email protected]

Xuan WANG Zhejiang University [email protected]

Yao WANG Xi'an Jiaotong University [email protected]

Shaoli WANG Shanghai University of Finance and Economics [email protected]

Yang WANG Hong Kong University of Science and Technology [email protected]

Qinwen WANG Fudan University [email protected]

Yazhen WANG University of Wisconsin - Madison [email protected]

Yingying WEI The Chinese University of Hong Kong [email protected]

Hu WEI Yingying Group, Inc. [email protected]

Honglei WEN Wenzhou University [email protected]

Yihong WU Yale University [email protected]

Zongqi XIA University of Pittsburgh [email protected]

Yin XIA Fudan University [email protected]

Han XIAO Rutgers University [email protected]

Jichun XIE Duke University [email protected]

Mengyu XU University of Central Florida [email protected]

Huaping XU Zhejiang Chinese Medical University [email protected]

Kaijie XUE Nankai University [email protected]

Yuan YAO Hong Kong University of Science and Technology [email protected]

Kaixian YU Didi Chuxing [email protected]

Zhou YU East China Normal University [email protected]

Sheng YU Tsinghua University [email protected]

Ming YUAN Columbia University [email protected]

Jinshan ZENG Jiangxi Normal University [email protected]

Nan ZHANG East China Normal University [email protected]


Anru ZHANG University of Wisconsin-Madison [email protected]

Zhengjun ZHANG University of Wisconsin-Madison [email protected]

Caiya ZHANG Zhejiang University City College [email protected]

Chi ZHANG Infervision Advanced Research [email protected]

Chuanhai ZHANG Zhongnan University of Economics and Law [email protected]

Cun-Hui ZHANG Rutgers University [email protected]

Xiaochen ZHANG Xiamen University [email protected]

Jia ZHANG Southwestern University of Finance and Economics [email protected]

Jun ZHANG Alibaba Group [email protected]

Feida ZHANG University of Queensland [email protected]

Heping ZHANG Yale University [email protected]

Sihai Dave ZHAO University of Illinois at Urbana-Champaign [email protected]

Xinghua ZHENG Hong Kong University of Science and Technology [email protected]

Zemin ZHENG University of Science and Technology of China [email protected]

Wei ZHONG Xiamen University [email protected]

Harrison ZHOU Yale University [email protected]

Maoyuan ZHOU Xiamen University [email protected]

Yan ZHOU Shenzhen University [email protected]

Ding-Xuan ZHOU School of Data Science & Department of Mathematics, City University of Hong Kong [email protected]

Qianqian ZHU Shanghai University of Finance and Economics [email protected]

Ji ZHU University of Michigan [email protected]

Yeying ZHU University of Waterloo, Canada [email protected]

James ZOU Stanford University [email protected]


List of Participants (ZJU)

Name E-mail

Younes BOKTAYA [email protected]

Xi CHEN [email protected]

Lei CHEN [email protected]

Silu CHEN [email protected]

Yang CHEN [email protected]

Lang CHENG [email protected]

Shunjie DONG [email protected]

Zhetong DONG [email protected]

Lingjie DU [email protected]

Xuansu FANG [email protected]

Chao FENG [email protected]

Yongchang FU [email protected]

Mingyang GONG [email protected]

Tao GONG [email protected]

Ming GUO [email protected]

Xu HE [email protected]

Yongxing HE [email protected]

Chuanfeng HU [email protected]

Huan HUANG [email protected]

Meixiang HUANG [email protected]

Shenwei HUANG [email protected]

Yi'an HUANG [email protected]

Hangjin JIANG [email protected]

Yuliang JIANG [email protected]

Xiaoying JIANG [email protected]

Meifang LAN [email protected]

Yixin LI [email protected]

Jiaqi LI [email protected]

Shuangbo LI [email protected]

Wei LI [email protected]

Yangang LI [email protected]


Yuning LI [email protected]

Zejian LI [email protected]

Huiping LI [email protected]

Junhong LIN [email protected]

Zhengyan LIN [email protected]

Xin LIN [email protected]

Peilin LIU [email protected]

Rong LIU [email protected]

Weiming LIU [email protected]

Zhunzhun LIU [email protected]

Wei LUO [email protected]

Tianyu MA [email protected]

Huiling MAO [email protected]

Xiaoye MIAO [email protected]

Jingjing PAN [email protected]

Tianxiao PANG [email protected]

Xinyue QIAN [email protected]

Qinghua RAN [email protected]

Jingwen REN [email protected]

Wuyue SHANGGUAN [email protected]

Kaili SONG [email protected]

Zhonggen SU [email protected]

Zuoqi TANG [email protected]

Chen TIAN [email protected]

Haochuan WANG [email protected]

Tiantian WANG [email protected]

Xiaoyu WANG [email protected]

Jia WEI [email protected]

Jiwei WEN [email protected]

Jun WEN [email protected]

Xiaoyu WU [email protected]

Daiqing XI [email protected]

Junpeng XIA [email protected]


Yu XIA [email protected]

Renjun XU [email protected]

Hang XU [email protected]

Chenkai XU [email protected]

Jiapan XU [email protected]

Guan'ao YAN [email protected]

Qing YANG [email protected]

Mengting YAO [email protected]

Jiangsheng YI [email protected]

Jianwei YIN [email protected]

Mufang YING [email protected]

Yan YU [email protected]

Lixin ZHANG [email protected]

Rongmao ZHANG [email protected]

Guangyi ZHANG [email protected]

Hao ZHANG [email protected]

Hongxin ZHANG [email protected]

Yu ZHANG [email protected]

Jiaming ZHU [email protected]

Luxi ZOU [email protected]


Maps
