Development of a knowledge based hybrid neural network (KBHNN) for studying the effect of diafiltration during ultrafiltration of whey

Desalination 273 (2011) 168–178


Dwaipayan Sen, Anirban Roy 1, Apratim Bhattacharya 2, Debadrito Banerjee, Chiranjib Bhattacharjee ⁎

Department of Chemical Engineering, Jadavpur University, Kolkata-700032, WB, India

⁎ Corresponding author. Tel.: +91 98364 02118; fax: +91 33 2414 … E-mail addresses: [email protected], cbhattach… (C. Bhattacharjee).
1 Present Address: 2900 Kingstown Road, APT A 111, Kingston, RI 02881, USA.
2 Present Address: Department of Chemical Engineering, Lehigh University, Bethlehem, PA 18015, USA.

0011-9164/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.desal.2010.10.038

Article info

Article history:
Received 21 August 2010
Received in revised form 15 October 2010
Accepted 18 October 2010
Available online 20 November 2010

Keywords:
Ultrafiltration
Diafiltration
Knowledge based hybrid neural network
First principle model
Levenberg–Marquardt algorithm

Abstract

Membrane surface dynamics are very difficult to predict. Available models give a rough estimate, but a true depiction is difficult because the magnitude and direction of the driving forces change as a function of time. The present study addresses this issue by combining deterministic and stochastic modelling to give a better understanding of membrane dynamics. The effect of diafiltration on the membrane has also been incorporated. A stochastic model developed as a knowledge based hybrid neural network (KBHNN) was trained using the Levenberg–Marquardt algorithm, with the film layer model serving as the deterministic layer, called the first principle model (FPM). The present work employs two different KBHNN architectures in order to assess the suitability and applicability of the hybrid network for predictions in an ultrafiltration (UF) process. In one architecture the neural part was in series with the FPM, and in the other it was in parallel with the FPM. The high correlation coefficient (R²) value indicates the correctness and precision of the underlying assumptions and establishes the validity of the developed network.

1. Introduction

Membrane separation processes have gained considerable attention in the chemical process industry in the last few decades. Despite their various advantages over centrifugation, evaporation, etc., a primary drawback of these processes is the decrease in the permeate flux with time due to the increased concentration of solute particles in the immediate vicinity of the membrane surface, which raises the local osmotic pressure [1]. This increase in osmotic pressure resists the passage of the solvent, and consequently of the microsolutes, through the membrane. Thus the recovery of the desired components using membrane separation is substantially affected. To alleviate these problems, different types of high-shear membrane modules are in use, which develop high shear stress on the membrane surface. Elevated shear stresses compensate for the negative concentration gradient built up on the membrane surface, hence enhancing the purity of the component. However, employing a high-shear arrangement within the membrane module increases the cost of the membrane separation process and also requires rigorous maintenance effort.

Diafiltration is another technique in membrane separation technology, in which a solvent, mainly water, is combined with the feed to the membrane module [2]. In this process fresh water is added continuously or batchwise to the retentate side, diluting the retentate and thereby alleviating the problems associated with the concentration boundary layer built up on the membrane surface [3]. It has been reported that diafiltration combined with membrane separation could be an efficient tool to enhance the purity of the retained stream [4] without any mechanical enhancement of the membrane module. In particular, diafiltration is now a state-of-the-art technique in the food and beverage, biotechnology and pharmaceutical industries. Arabelle et al. [5] have shown that diafiltration in conjunction with a cross-flow module significantly enhances the purity in the separation of proteins by continuously washing the retained solutes on the membrane with increasing dilution of the feed. In their study they employed ultrafiltration (UF) to separate α-lactalbumin from casein whey, applying different modes of diafiltration in conjunction with UF, followed by a mathematical formulation of the process.

For membrane separation processes like UF, the development of a mechanistic model is quite time-consuming and cumbersome. The design of membrane modules that facilitate effective monitoring and control poses a major challenge for UF systems. In such cases, however, a detailed understanding of the dynamics serves more of a pedantic than a practical purpose. In most of the models developed for the prediction of flux during UF, either with or without diafiltration, the interaction of the solutes with the polymeric membrane is neglected. Other very important parameters such as pore size distribution, contribution of the unclogged pores, etc. are also neglected, which somewhat compromises the accuracy of a comprehensive mechanistic model.


Fig. 1. Schematic diagram of the cross-flow membrane module.


Moreover, attempts to explain the limiting flux with osmotic pressure models have also been reported [6]. However, because the osmotic pressure is negligible in many UF processes, the osmotic pressure model cannot provide a generalized explanation for the limiting flux in this case [7,8]. Predicting the outcome of a membrane separation process becomes more difficult with this type of classical model when the feed components have a tendency to interact with the membrane material. This is especially pronounced when the feed consists of proteins, carbohydrates and/or other bio-components. The amphiphilic nature of biomolecular species makes them more surface active, and thus they are more readily adsorbed on the membrane surface [9]. Srebnik [9] attempted to develop a predictive model applying mean field theory, considering the fouling that could occur because of protein–membrane interaction. However, that model assumed that the fouling species are homopolymeric in nature, which might not be a good approximation in practice because of the denaturation of proteins or other biomolecules. Thus, the generality of the model is compromised by its restrictive assumptions. This lack of generality in mathematical models leads researchers to rely on the artificial neural network (ANN) approach, where the prediction is based on the real-life dataset fed to the network without considering the physical nature of the system.

The ANN is basically a training-based computational paradigm with initial inspirations rooted in biology and physics [10]. The computational framework of an ANN entails a large number of interconnected processing elements or nodes, called neurons, arranged in different layers. Its true computational power derives from a combination of adaptable interconnections, called the weights, a layered topology, and the nonlinearities associated with the neuron computations. Bhattacharjee et al. [11] made a rigorous comparative study of different neural network topologies and effectively predicted the permeate flux in a rotating disc membrane module with an absolute average deviation (AAD) of 6.9%. The challenge associated with these types of black-box systems is the lack of understanding of the nonlinear physics of the process. This explains the inability of such models to enhance the process output by adjusting the inherent process parameters. Curcio et al. [12] observed that if the ANN is used as a black-box type system, it would be a complicated system [13] having a large number of connections and parameters, which might tend to overburden the network and thus generate erroneous predictions. To alleviate these problems, an existing classical model describing the process, normally called the first principle model (FPM), can be incorporated into the normal ANN architecture, leading to a reduction in the dimensionality of the input vector [14,15] fed to the simple neural network. This approach is called the knowledge based hybrid neural network (KBHNN) and has proved to be a useful alternative to the simple ANN.

The objective of this article is two-fold. Firstly, it presents a study of the impact of diafiltration on the fractionation of the main protein constituents present in whey, Ig, BSA, α-La and β-Lg, using a cross-flow ultrafiltration membrane module. Secondly, it proposes a hybrid model to understand the behaviour of cross-flow ultrafiltration used in conjunction with the diafiltration mode during the recovery of proteins. To this end, a KBHNN was adopted to design a predictive hybrid model in which the FPM based on the film theory was used in conjunction with the neural network system, with an accuracy level of around 99%.

2. Materials and methods

2.1. Materials

For the present study casein whey was obtained from a local confectionery industry. Deionised water was obtained from the Arium 611DI ultrapure water system. The feed to the Arium 611DI was from the Arium 61315 reverse osmosis (RO) system (make: Sartorius A.G., Göttingen, Germany). Coomassie Brilliant Blue (G-250) for the Bradford protein assay, made by Pierce Biotechnology, Inc., Rockford, IL 61105, USA, and obtained through Hysel India Pvt. Ltd., New Delhi, India, was used to analyse the total protein content.

2.2. Membranes and module

The study was carried out with an asymmetric polyether sulphone (PES) membrane cast in a cross-flow membrane module obtained from Vivaflow (Fig. 1), equipped with a peristaltic pump for pumping the feed to the membrane module. A complete schematic representation of the process flowsheet is shown in Fig. 2. PES membranes (with an effective surface area of 0.02 m²) of 100 kg mol⁻¹, 50 kg mol⁻¹, and 5 kg mol⁻¹ molecular weight cut-off (MWCO) were used in succession. The water permeability of each membrane was calculated using the following equation [16] and is given in Table 1.

$$J_w = L_{P,W}\,\Delta P \qquad (1)$$

where LP,W is the water permeability of the membrane, Jw is the water flux and ΔP is the transmembrane pressure (TMP).
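As a rough illustration of Eq. (1), the sketch below back-calculates the water permeability from a single pure-water flux reading. The function name and the flux value are hypothetical and are not taken from the study's measurements; only the form of the relation comes from the text.

```python
def water_permeability(water_flux, tmp):
    """Return L_P,W (m s^-1 Pa^-1) from Eq. (1): J_w = L_P,W * dP.

    water_flux: pure-water flux J_w in m s^-1 (m^3 per m^2 per s)
    tmp:        transmembrane pressure dP in Pa
    """
    if tmp <= 0:
        raise ValueError("TMP must be positive")
    return water_flux / tmp


# Hypothetical reading: J_w = 4.5e-5 m/s at a TMP of 0.15 MPa.
L_pw = water_permeability(4.5e-5, 0.15e6)
print(f"L_P,W = {L_pw:.2e} m s^-1 Pa^-1")
```

In practice the permeability would more likely be taken as the slope of a straight-line fit of flux against TMP over several pressures; the single-point form is used here only to keep the example short.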

Each step of the membrane separation process was carried out in conjunction with a two-stage discontinuous diafiltration (DD) process. After each run the membrane was thoroughly washed using 2.5×10⁻⁴ m³ of 0.038 kg m⁻³ NaOCl in 2.5×10⁻⁴ m³ of 20 kg m⁻³ NaOH, keeping the module in total recirculation at a flow rate of 8.33×10⁻⁸ to 1.67×10⁻⁷ m³ s⁻¹ for 1800–2400 s. The water flux of the washed membrane was measured to ensure that 90% of the flux of the fresh membrane had been regained.

2.3. Pretreatment of casein whey

In order to prevent membrane fouling, the raw feed was first centrifuged in a high-speed research centrifuge at 1046.67 rad s⁻¹ for 1200 s to separate suspended solids. After centrifugation, the supernatant was subjected to 0.45×10⁻⁶ m microfiltration (MF) in an all-glass vacuum filtration unit fitted with an oil-free portable vacuum pump, with a symmetric PES membrane (0.047 m diameter, pore size 0.45×10⁻⁶ m) used as the filter medium.

2.4. Membrane compaction and water run

Fig. 2. Schematic representation of the experimental methods.

Prior to any experimental run, the cross-flow membrane was subjected to water compaction at a pressure of 0.28 MPa, higher than the operating pressure, until a steady water flux was obtained, to ensure that the applied pressure caused no changes in the membrane pore diameter during the experimental process. The membrane hydraulic resistance was calculated using the following equation.

$$R_m = \frac{\Delta P}{\mu_w J_w} \qquad (2)$$

where Rm is the membrane hydraulic resistance, μw is the viscosity of water, Jw is the water flux and ΔP is the TMP.
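Eq. (2) can be evaluated in the same way. The sketch below is a minimal illustration: the default viscosity (about 8.9×10⁻⁴ Pa s, water near 298 K) and the steady flux value are assumptions made for the example, and only the 0.28 MPa compaction pressure comes from the text.

```python
def membrane_resistance(tmp, water_flux, viscosity=8.9e-4):
    """Return R_m (m^-1) from Eq. (2): R_m = dP / (mu_w * J_w).

    tmp:        transmembrane pressure in Pa
    water_flux: steady pure-water flux J_w in m s^-1
    viscosity:  water viscosity mu_w in Pa s (assumed value near 298 K)
    """
    return tmp / (viscosity * water_flux)


# Compaction pressure from the text (0.28 MPa) with a hypothetical steady flux.
R_m = membrane_resistance(0.28e6, 8.4e-5)
print(f"R_m = {R_m:.3e} m^-1")
```

Comparing Eqs. (1) and (2) shows that the two quantities carry the same information, since R_m = 1/(μ_w L_P,W).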

2.5. Methods

5×10⁻⁴ m³ of permeate collected from the MF unit was fed to a 100 kg mol⁻¹ PES membrane in order to separate immunoglobulins (IgG, IgA, and IgM; molecular weight 150 kg mol⁻¹–900 kg mol⁻¹) from the other low molecular weight components present in casein whey. The process was carried out with two stages of DD, maintaining a volume reduction ratio (ζ) of 0.4 at a pressure of 0.15 MPa. The volume reduction ratio is defined as the ratio of the volume of retentate after each diafiltration/filtration step to the initial feed volume. Therefore, at the end of each filtration step with the 100 kg mol⁻¹ membrane the retentate volume was 2×10⁻⁴ m³. To start the next diafiltration stage, 3×10⁻⁴ m³ of make-up water was added to bring the feed volume back to 5×10⁻⁴ m³. Thus, after completion of two stages with the 100 kg mol⁻¹ membrane, a total of 9×10⁻⁴ m³ of permeate was collected, which was in turn used as the feed to the 50 kg mol⁻¹ membrane. A 50 kg mol⁻¹ MWCO PES cross-flow membrane was used to separate BSA, whose average molecular weight is 66 kg mol⁻¹. This membrane step was likewise followed by two stages of DD with a volume reduction ratio (ζ) of 0.4. As explained above, the total permeate volume collected from the 50 kg mol⁻¹ membrane was 16.2×10⁻⁴ m³, containing mainly β-Lg (average molecular weight 18.3 kg mol⁻¹), α-La (average molecular weight 14.2 kg mol⁻¹) and lactose (molecular weight 0.3423 kg mol⁻¹), along with some other lower molecular weight substances. This was again fed to a 5 kg mol⁻¹ membrane followed by two stages of DD at the same volume reduction ratio (ζ) of 0.4. In all of the runs the permeate flux was noted at certain time intervals. Fig. 2 shows the schematic representation of the process flowsheet.

Table 1. Water permeability (m s⁻¹ Pa⁻¹) values for the 100 kg mol⁻¹, 50 kg mol⁻¹ and 5 kg mol⁻¹ UF membranes.

| Membrane MWCO (kg mol⁻¹) | Permeability (m s⁻¹ Pa⁻¹) × 10¹⁰ |
|---|---|
| 100 | 3 |
| 50 | 2 |
| 5 | 0.5 |
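The volume bookkeeping described above can be checked with a few lines of arithmetic. The sketch below assumes, as the text implies, that each membrane sees one UF step plus two DD stages and that make-up water restores the retentate to the feed volume before each DD stage; the function name is illustrative only.

```python
def permeate_collected(feed_volume, zeta, n_dd_stages):
    """Volume balance for one UF step followed by n discontinuous DD stages.

    Every step concentrates `feed_volume` down to zeta * feed_volume, and the
    make-up water added before each DD stage restores the feed volume.
    Returns (retentate volume, make-up water per stage, total permeate), in m^3.
    """
    retentate = zeta * feed_volume
    makeup = feed_volume - retentate          # water added before each DD stage
    permeate_per_step = feed_volume - retentate
    total = permeate_per_step * (1 + n_dd_stages)
    return retentate, makeup, total


feed = 5e-4  # m^3, MF permeate fed to the 100 kg/mol membrane
for mwco in (100, 50):
    retentate, makeup, total = permeate_collected(feed, zeta=0.4, n_dd_stages=2)
    print(f"{mwco} kg/mol: retentate {retentate:.2e}, make-up {makeup:.2e}, "
          f"permeate {total:.2e} m^3")
    feed = total  # the pooled permeate becomes the next membrane's feed
```

With ζ = 0.4 this reproduces the volumes quoted in the text: 9×10⁻⁴ m³ of permeate from the 100 kg mol⁻¹ membrane and 16.2×10⁻⁴ m³ from the 50 kg mol⁻¹ membrane.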

2.6. Analysis

A. Total Protein Concentration (TPC): TPC was measured using a Hitachi dual beam UV–visible spectrophotometer by the principle of the Bradford protein assay at 590 nm. The assay is based on the observation that the maximum absorbance for an acidic solution of Coomassie Brilliant Blue G-250 shifts from 465 nm to 595 nm when binding to protein occurs [17].

B. Individual Protein Concentration: Protein concentrations were measured for each sample using HPLC (Series 200; Makers: Perkin Elmer®, UV Detector). Samples were analysed using a Merck LiChroCART LiChrospher C18 end-capped column (5×10⁻⁶ m, 0.004×0.25 m) (Germany). Eluent preparation followed the method suggested by Ballin [18]. Flow rate was maintained at 1.67 m−3 s−1 at 308 K.

C. Lactose measurement: Lactose concentration was measured using HPLC (Series 200; Makers: Perkin Elmer®, RI detector). The sample was analysed using a Spheri 5 Amino column (0.005×0.22 m) (Perkin Elmer, India). Acetonitrile and water in a ratio of 75:25 was used as the mobile phase with a flow rate of 1.67 m−3 s−1 at a column temperature of 298 K.

Fig. 3. Schematic diagram of the neural network architecture (Winp,hidd: weights for the interconnection between input and hidden layers' neurons; Whidd,out: weights for the interconnection between hidden and output layers' neurons; bhidd: biases for the hidden layer neurons; bout: biases for the output layer neurons).

3. Development of the FPM

Balancing mass on the microsolutes for batch diafiltration, the following equation can be formulated,

$$-\frac{d\left[V_m(t)\,C_{Bulk,Micro,m}(t)\right]}{dt} = A\,J_m(t)\,C_{0,Micro}\,P_m(t) \qquad (3)$$

where $J_m(t) = -\frac{1}{A}\frac{dV_{P,m}(t)}{dt}$ is the permeate flux, A is the effective membrane area and $P_m(t) = \frac{C_{P,Micro,m}(t)}{C_{0,Micro}}$ is the permeation coefficient of the microsolute.

Again making an overall balance of the process it was found that,

$$V_m(t) = V_0 - V_{P,m}(t). \qquad (4)$$

Now applying the following boundary condition, at $V_{P,m}(\infty) = V_0(1-\zeta) + (m-1)V_w$, $C_{Bulk,Micro,m}(\infty) = \left[V_0 - (1-R)\{V_0\zeta + (m-1)V_w\}\right]C_{0,Micro}$, in Eqs. (3) and (4), the following Eq. (5) was found, from which the microsolute concentration can be evaluated.

$$C_{Bulk,Micro,m}(t) = \left[\left(V_0 + P_m(t)\right) - (1-R)\{V_0\zeta + (m-1)V_w\}\right]C_{0,Micro}\left[\frac{V_0\zeta + (m-1)V_w}{V_0 - V_{P,m}(t)}\right] \qquad (5)$$

It was assumed that the permeation of the solutes through the membrane was limited solely by the concentration polarisation developed on the membrane surface. It was also assumed that the build-up of the concentration boundary layer was limited with time, as the feed introduced to the membrane was continuously diluted. Based on this pseudo-steady state assumption, in the subsequent formulation of the macrosolute concentration on the retentate side, the film layer model (Eq. (6)) was used to predict the solute concentration on the membrane wall. Applying this model to the microsolutes, the concentration polarisation coefficient (ξ) was estimated using Eqs. (5) through (7). Using the value of this coefficient and Eq. (8), the concentration of the macrosolute on the retentate side was then estimated (Eq. (9)).

$$J_m(t) = k_{i,Micro,avg}\,\ln\left(\frac{C_{w,Micro,m}(t) - C_{P,Micro,m}(t)}{C_{0,Micro}(t) - C_{P,Micro,m}(t)}\right) \qquad (6)$$

$$C_{w,Micro,m}(t) = \xi\,C_{Bulk,Micro,m}(t) \qquad (7)$$

$$J_m(t) = k_{i,Macro,avg}\,\ln\left(\frac{C_{w,Macro,m}(t) - C_{P,Macro,m}(t)}{C_{0,Macro}(t) - C_{P,Macro,m}(t)}\right) \qquad (8)$$

$$C_{Bulk,Macro,m}(t) = \frac{C_{w,Macro,m}(t)}{\xi} \qquad (9)$$
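The chain from Eq. (6) to Eq. (9) can be sketched as follows. Inverting the film-layer relation gives C_w = C_P + (C_0 − C_P) exp(J/k), which is what `wall_concentration` computes; the function names and all numerical inputs are hypothetical and serve only to show the order of the calculation.

```python
import math


def wall_concentration(flux, k, c0, cp):
    """Invert the film-layer relation J = k ln((Cw - Cp)/(C0 - Cp)) for Cw."""
    return cp + (c0 - cp) * math.exp(flux / k)


def macro_bulk_concentration(flux, k_micro, k_macro,
                             c0_micro, cp_micro, c_bulk_micro,
                             c0_macro, cp_macro):
    """Estimate the macrosolute bulk concentration following Eqs. (6)-(9).

    Returns (C_Bulk,Macro, xi), where xi is the polarisation coefficient.
    """
    cw_micro = wall_concentration(flux, k_micro, c0_micro, cp_micro)  # Eq. (6)
    xi = cw_micro / c_bulk_micro                                      # Eq. (7)
    cw_macro = wall_concentration(flux, k_macro, c0_macro, cp_macro)  # Eq. (8)
    return cw_macro / xi, xi                                          # Eq. (9)


# Hypothetical inputs (SI units): flux in m/s, k in m/s, concentrations in kg/m^3.
c_macro, xi = macro_bulk_concentration(
    flux=2.0e-5, k_micro=4.0e-5, k_macro=1.5e-5,
    c0_micro=5.0, cp_micro=3.0, c_bulk_micro=4.0,
    c0_macro=1.0, cp_macro=0.05)
print(f"xi = {xi:.3f}, C_Bulk,Macro = {c_macro:.3f} kg/m^3")
```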

As the feed to the membrane module was continuously diluted, the boundary layer thickness will be much less than the height of the channel. As all the mass balance equations are written within the boundary layer, the membrane separation process was treated here as an open channel flow over a flat plate. The values of ki,Micro,avg and ki,Macro,avg were therefore evaluated using the following equation [19],

$$Sh = 0.664\,(Re)^{0.5}\,(Sc)^{0.33}. \qquad (10)$$
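A minimal sketch of how Eq. (10) yields an average mass transfer coefficient is given below. It assumes the usual definitions Re = ρuL/μ, Sc = μ/(ρD) and Sh = kL/D; the characteristic length, cross-flow velocity and diffusivity are hypothetical values, not data from the study.

```python
def mass_transfer_coefficient(velocity, length, diffusivity,
                              density=1000.0, viscosity=8.9e-4):
    """Average mass transfer coefficient k (m/s) from Sh = 0.664 Re^0.5 Sc^0.33.

    velocity:    cross-flow velocity u in m/s (assumed)
    length:      characteristic plate length L in m (assumed)
    diffusivity: solute diffusivity D in m^2/s (assumed)
    density, viscosity: fluid properties, defaulting to water near 298 K
    """
    reynolds = density * velocity * length / viscosity
    schmidt = viscosity / (density * diffusivity)
    sherwood = 0.664 * reynolds ** 0.5 * schmidt ** 0.33
    return sherwood * diffusivity / length


# Hypothetical case: 0.1 m/s over a 0.1 m path, protein-like diffusivity.
k_avg = mass_transfer_coefficient(0.1, 0.1, 7.0e-11)
print(f"k_avg = {k_avg:.2e} m/s")
```

Because of the Re^0.5 dependence, quadrupling the cross-flow velocity only doubles the coefficient in this correlation.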

Thus the percentage purity can be evaluated using Eq. (11),

$$\text{Percentage Purity} = \frac{100\,C_{Bulk,Macro,m}(t)}{C_{Bulk,Micro,m}(t) + C_{Bulk,Macro,m}(t)}. \qquad (11)$$
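For illustration, Eq. (11) can be applied to a made-up sequence of retentate concentrations in which the microsolute is progressively washed out while the macrosolute is retained. The numbers are hypothetical and are not the measured values.

```python
def percentage_purity(c_bulk_macro, c_bulk_micro):
    """Eq. (11): macrosolute share of the total retentate solute, in percent."""
    return 100.0 * c_bulk_macro / (c_bulk_micro + c_bulk_macro)


# Hypothetical bulk concentrations (kg/m^3) after UF, 1st DD and 2nd DD.
stages = [("UF", 10.0, 5.0), ("DD 1", 10.0, 2.0), ("DD 2", 10.0, 0.8)]
for name, c_macro, c_micro in stages:
    print(f"{name}: {percentage_purity(c_macro, c_micro):.1f} % purity")
```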

4. Design of KBHNN

4.1. Simple neural network architecture (black-box model)

A main advantage of the simple ANN black-box model is its inherent ability to approximate different physical processes with an identical computational paradigm. It thus saves much of the effort spent on identifying the process parameters during the formulation of a deterministic model [20]. Fig. 3 shows a schematic representation of the ANN architecture. The number of nodes/neurons in the input layer and output layer is mainly fixed by the experimental parametric constraints, whereas the number of hidden layer nodes, which may be regarded as the central processing unit of the ANN, is mainly fixed by the ANN user. As the simple ANN architecture is supervised only by the experimental data points, the performance of the network depends heavily on the information supplied to it during training. Among its several advantages, this is one of the primary disadvantages of the black-box ANN model, and it stems from the lack of proper understanding of the process. To alleviate this issue, researchers have tried to include a model of the normal process dynamics in the ANN architecture in order to check the correctness of the training data and hence improve the accuracy of the model prediction. The next two sections are mainly confined to approaches where the classical process model is used in conjunction with the neural network model.
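A forward pass through a network of the kind shown in Fig. 3 can be sketched as below. The argument names mirror the figure's labels (Winp,hidd, bhidd, Whidd,out, bout), but the tanh hidden activation, the linear output and the layer sizes are assumptions, since the text does not specify them.

```python
import math


def forward(x, w_inp_hidd, b_hidd, w_hidd_out, b_out):
    """One forward pass through a single-hidden-layer feed-forward network.

    x:          list of input values
    w_inp_hidd: one weight row per hidden neuron (each row has len(x) entries)
    b_hidd:     one bias per hidden neuron
    w_hidd_out: one weight row per output neuron (each row has one entry per hidden neuron)
    b_out:      one bias per output neuron
    Hidden neurons use tanh (an assumption); outputs are linear.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_inp_hidd, b_hidd)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_hidd_out, b_out)]


# Hypothetical 2-3-1 network: two scaled inputs (e.g. time and MWCO), one output.
y = forward([0.5, 1.0],
            w_inp_hidd=[[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]],
            b_hidd=[0.0, 0.1, -0.1],
            w_hidd_out=[[1.0, -0.5, 0.8]],
            b_out=[0.05])
print(y)
```

Training consists of adjusting the weights and biases so that the outputs match the experimental targets, which is where the network's dependence on the supplied data enters.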

4.2. Parallel hybrid neural network (PHNN) architecture

In the parallel approach, the inaccuracy in the value predicted by the FPM is minimised by adding the residuals obtained from the neural network.



Eqs. (12) and (13) present the form of the model parameters thatare used in the FPM to predict the value of the response variable [14],

$$Y_{FPM} = \phi(X, \eta) \qquad (12)$$

$$\eta = \psi(a) \qquad (13)$$

where YFPM is the variable predicted using the FPM, X is the input variable to the model, η are the constants used in the model and a are the assumptions on the basis of which the η's are taken as constants. Sometimes these assumptions regarding the constants in the FPM lead to erroneous results and thus inaccurate predictions. This problem can be overcome by recalculating the values of those constants under the new conditions, but in many cases this is difficult because of the limited knowledge of the relations between the constants and the new conditions. The HNN can handle this scenario without any prior knowledge of the relational dependencies. In the parallel network approach, the difference between the experimental value (YExp) and the value predicted by the FPM (YFPM) was calculated; these differences are the target residuals (R) for the neural network system, and the network was trained using the Levenberg–Marquardt algorithm [21]. Fig. 4 presents the general scheme of the parallel HNN.

$$R = Y_{Exp} - Y_{FPM} \qquad (14)$$

Fig. 4. Parallel hybrid neural network algorithm.
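The parallel scheme can be sketched end to end as follows. This is not the authors' MATLAB code: the FPM is replaced by a hypothetical stand-in function, the "experimental" data are synthetic, the network has a single tanh neuron, and the trainer is a bare-bones Levenberg-style damped Gauss-Newton loop (λI damping, finite-difference Jacobian) written only to show how the residuals of Eq. (14) become the training targets.

```python
import math


def fpm(t):
    """Stand-in first-principle prediction of % purity (hypothetical form)."""
    return 60.0 + 20.0 * (1.0 - math.exp(-t))


def nn(t, p):
    """Tiny network with one tanh hidden neuron; p = [w1, b1, w2, b2]."""
    return p[2] * math.tanh(p[0] * t + p[1]) + p[3]


def sse(p, ts, targets):
    """Sum of squared errors between network output and target residuals."""
    return sum((nn(t, p) - r) * (nn(t, p) - r) for t, r in zip(ts, targets))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def train_lm(p, ts, targets, iters=100, lam=1e-2, h=1e-6):
    """Levenberg-style damped Gauss-Newton loop with a finite-difference Jacobian."""
    n = len(p)
    for _ in range(iters):
        errs = [nn(t, p) - r for t, r in zip(ts, targets)]
        jac = []
        for t in ts:
            row = []
            for j in range(n):
                q = list(p)
                q[j] += h
                row.append((nn(t, q) - nn(t, p)) / h)
            jac.append(row)
        # Damped normal equations: (J^T J + lam I) step = -J^T e
        a = [[sum(r[i] * r[j] for r in jac) + (lam if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        g = [-sum(r[i] * e for r, e in zip(jac, errs)) for i in range(n)]
        trial = [pi + si for pi, si in zip(p, solve(a, g))]
        if sse(trial, ts, targets) < sse(p, ts, targets):
            p, lam = trial, max(lam * 0.1, 1e-8)   # step accepted: reduce damping
        else:
            lam *= 10.0                             # step rejected: increase damping
    return p


# Synthetic "experimental" purity: FPM output plus a made-up systematic deviation.
ts = [0.25 * i for i in range(13)]
y_exp = [fpm(t) + 3.0 * math.tanh(0.8 * t) - 1.0 for t in ts]

residuals = [ye - fpm(t) for t, ye in zip(ts, y_exp)]   # Eq. (14): R = Y_Exp - Y_FPM
p0 = [0.5, 0.0, 1.0, 0.0]
p_fit = train_lm(p0, ts, residuals)


def hybrid(t):
    """Parallel hybrid prediction: FPM output plus the network's residual correction."""
    return fpm(t) + nn(t, p_fit)


print(f"SSE before: {sse(p0, ts, residuals):.4f}, after: {sse(p_fit, ts, residuals):.6f}")
```

The design point is that the network never has to learn the bulk of the response, only the part the FPM misses, so a much smaller network can suffice than in the black-box case.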

4.3. Series hybrid neural network (SHNN) architecture

Fig. 5 shows the diagrammatic representation of the series approach of the hybrid neural network. In this approach a process variable which is difficult to measure is estimated by the neural network and fed to the FPM as an input. The output of the FPM is then checked against the experimental value for convergence. In the present work, the values of the macro- and microsolute concentrations on the membrane were evaluated by the network and fed to the FPM to calculate the percentage purity of the whey protein. With the series approach, therefore, assumptions were avoided in the ultimate formulation of the HNN.
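The series arrangement can be sketched as a simple composition: a network head estimates the two concentrations, and the FPM's purity relation (Eq. (11)) turns them into the quantity that is compared with experiment. The network form (a single linear layer with a softplus output to keep concentrations positive), the function names and the numbers are all assumptions made for illustration.

```python
import math


def softplus(z):
    """Smooth positive function, written to stay numerically stable for large |z|."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def nn_concentrations(x, weights, biases):
    """Hypothetical network head returning (C_macro, C_micro), both positive."""
    return tuple(softplus(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(weights, biases))


def fpm_purity(c_macro, c_micro):
    """FPM step: percentage purity from the estimated concentrations, Eq. (11)."""
    return 100.0 * c_macro / (c_macro + c_micro)


def series_hybrid(x, weights, biases):
    """Series HNN: network output is fed to the FPM, which gives the prediction."""
    c_macro, c_micro = nn_concentrations(x, weights, biases)
    return fpm_purity(c_macro, c_micro)


# Hypothetical scaled inputs (e.g. time, MWCO) and untrained weights.
x = [0.5, 1.0]
weights = [[0.8, 0.3], [-0.6, 0.2]]
biases = [1.0, 0.0]
purity_pred = series_hybrid(x, weights, biases)
purity_exp = 85.0                                  # made-up experimental value
squared_error = (purity_pred - purity_exp) ** 2    # quantity a trainer would minimise
print(f"predicted purity {purity_pred:.1f} %, squared error {squared_error:.2f}")
```

In training, the error is evaluated after the FPM step, so the weights are adjusted on the basis of the purity mismatch rather than on directly measured wall concentrations.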

The Levenberg–Marquardt algorithm [21] was used to train the feed-forward network used in the present study. Figs. 4 and 5 show the algorithm flowsheets used to train the PHNN [21] and SHNN [22], respectively. All the simulation codes were developed using the MATLAB® Version 7.0 (R14) software package. Selection of the input variables fed to the neural network is a most crucial step in designing the HNN. Mainly the variables which are not used in the FPM are used in the neural network. Sometimes, however, variables used in the FPM are also used in the neural network part. This is done when the information these variables carry within the FPM is inadequate for modelling the original process. With proper selection of variables, redundancy of the inputs to the network can be avoided and the execution time for the network training can be significantly reduced.

Fig. 5. Series hybrid neural network model.

5. Results and discussions

In the first part of this section, the analysis is limited to the experimental observations, whereas in the subsequent parts the analysis is focussed on the modelling of the process. In this section a comparative study has been made of the predictive abilities of the series hybrid neural network (SHNN) and the parallel hybrid neural network (PHNN).

5.1. Variation of permeate flux and percentage purity of the whey protein with time

Figs. 6 through 8 show the variation of normalised permeate flux with time at a volume reduction ratio (ζ) of 0.4 and a TMP of 0.15 MPa for each of the membranes (with different MWCOs). The process was carried out in conjunction with two-stage diafiltration. The figures show that with the subsequent stages of diafiltration, the percentage drop in the steady state permeate flux from the initial flux decreases. This reflects the continuous washing effect imparted on the membrane surface due to dilution. These results confirm increased passage of the microsolute through the membrane, giving a substantial increase in the purity of the macrosolutes retained by the membrane. Interestingly, the dilution effect became more pronounced, dominating over the MWCO of the membrane, showing an apparent increase of 30% in the permeate flux for the 50 kg mol⁻¹ over the 100 kg mol⁻¹ membrane, and an 11% increase for the 5 kg mol⁻¹ over the 50 kg mol⁻¹ MWCO membrane. This implies an increased chance of reusability of the membrane and a considerable reduction in the membrane washing cost, in addition to an increase in percentage purity of the components. Figs. 9 through 11 present the variation of the percentage purity of the whey protein with time and diafiltration stages. It can be concluded from the figures that two stages of diafiltration were quite effective in achieving more than 90% purity of the whey protein. In the first stage of diafiltration for the 100 kg mol⁻¹ and 50 kg mol⁻¹ membranes, there is a substantial initial increase in purity, but in the subsequent time steps the purity curves become almost flat, indicating that the permeation of microsolutes through the membrane has reached its maximum. This can be attributed to the diminishing effect of the film or concentration gradient developed on the membrane surface, with an effective decrease in the concentration of the microsolute on the retentate side. In the 5 kg mol⁻¹ membrane, this effect is substantial because of the maximum dilution of the feed introduced to the membrane.

Fig. 6. Permeate flux versus time plot for the 100 kg mol⁻¹ membrane with diafiltration mode at ΔP=0.15 MPa and ζ=0.4 (○: UF of the feed to the membrane unit; □: 1st stage of DD where a certain make-up volume of water was added to the retentate after filtration of the feed; ◊: 2nd stage of DD where again a make-up volume of water was added to the retentate after filtration for the 1st stage DD feed).

Fig. 7. Permeate flux versus time plot for the 50 kg mol⁻¹ membrane with diafiltration mode at ΔP=0.15 MPa and ζ=0.4 (○: UF of the feed to the membrane unit; □: 1st stage of DD where a certain make-up volume of water was added to the retentate after filtration of the feed; ◊: 2nd stage of DD where again a make-up volume of water was added to the retentate after filtration for the 1st stage DD feed).

Fig. 8. Permeate flux versus time plot for the 5 kg mol⁻¹ membrane with diafiltration mode at ΔP=0.15 MPa and ζ=0.4 (○: UF of the feed to the membrane unit; □: 1st stage of DD where a certain make-up volume of water was added to the retentate after filtration of the feed; ◊: 2nd stage of DD where again a make-up volume of water was added to the retentate after filtration for the 1st stage DD feed).

The following discussion is confined to the prediction capability of the combined use of the FPM with the neural network, in the form of the PHNN and SHNN respectively, giving an idea of the usability of these models for evaluating the enriched percentage purity observed in cross-flow UF in conjunction with diafiltration.

5.2. Percentage purity prediction using the FPM as a sole model, PHNN and SHNN

A considerable effort was made in the selection of the variablesinput to the neural network during the present work. Table 2 presentsthe list of the input variables used in the FPM and the ones used in theneural network system. From the table it is worth noting that somevariables which are not used in the FPM at all are used as inputs to the

Fig. 9. Variation of percentage purity of Igwith time for UF of whey in the 100 kg mol−1

PES membrane with three stage diafiltration at ΔP=0.15 MPa and ζ=0.4 (○: UF of thefeed to the membrane unit; □: 1st stage of DD where a certain make-up volume ofwater was added to the retentate after filtration of the feed; ◊: 2nd stage of DD whereagain a make-up volume of water was added to the retentate after filtration for the 1ststage DD feed) (Inset: Variation of percentage purity of Ig with the number of stages).

image of Fig.�7

image of Fig.�8

image of Fig.�9

image of Fig.�6

Table 2Variables that are used in the hybrid modules in the FPM and in the neural network.

Sl.no.

Variables Parallel HNNa Series HNN

FPM inKBHNNb

NN inKBHNN

FPM inKBHNN

NN inKBHNN

1 Time, t, (s) NU U NU U2 MWCO of the membrane, (kg mol−1) NU U NU U4 Initial feed volume, V0, (m3) U NU U U5 Volume concentration ratio, ζ U NU U NU6 Make-up water volume, Vw, (m3) U NU U NU7 Rejection coefficient, R U NU U NU8 Diafiltration stage number, m U NU U NU9 Volume of the permeate, VP,m(t) , (m3) U NU U NU10 Initial concentration of the microsolute

in the feed, C0,Micro, (kg m−3)U NU U NU

11 Initial feed concentration, C0, (kg m−3) NU NU NU U12 Permeate concentration of the

microsolute in the permeate,CP,Micro,m(t), (kg m−3)

U U U NU

13 Effective area of the membrane, A,(m2)

U NU U NU

14 Bulk concentration of themacrosolute, C0,Macro, (kg m−3)

U NU U NU

aHybrid Neural Network; b Knowledge Based Hybrid Neural Network; NU: Not used inthe model; and U: Used in the model.

Fig. 10. Variation of percentage purity of BSA with time for UF of whey in the 50 kgmol−1 PESmembrane with three stage diafiltration atΔP=0.15 MPa and ζ=0.4 (○: UFof the feed to the membrane unit; □: 1st stage of DD where a certain make-up volumeof water was added to the retentate after filtration of the feed; ◊: 2nd stage of DDwhereagain a make-up volume of water was added to the retentate after filtration for the 1ststage DD feed) (Inset: Variation of percentage purity of BSA with the number of stages).


neural network. Also, some variables have been used both in the FPMand the neural network system because of their inadequacy inexplaining the output from the FPM alone. Figs. 12 through 14 show acomparative view on the squared error between model predicted andexperimental percentage purity value. Formulation of the FPM wasbased on the concentration polarisation model, which was basicallyused to calculate the bulk concentration of the microsolutes andmacrosolutes on the retentate side. Now with the 100 kg mol−1

membrane (Fig. 12) when the feed was first introduced to themembrane module there is no built-up of the solute concentration inthe vicinity of the membrane. Hence no negative concentrationgradient could occur during that period, which limits the usage of theconcentration polarisationmodel or film layermodel. But at the end of

Fig. 11. Variation of percentage purity of α-La and β-Lg with time for UF of whey in the50 kg mol−1 PES membrane with three stage diafiltration at ΔP=0.15 MPa and ζ=0.4(○: UF of the feed to the membrane unit;□: 1st stage of the DD where a certain make-up volume of water was added to the retentate after filtration of the feed; ◊: 2nd stageof DD where again a make-up volume of water was added to the retentate afterfiltration for the 1st stage DD feed) (Inset: Variation of percentage purity of α-La andβ-Lg with the number of stages).

the first filtration step with the 100 kg mol−1 membrane the solutedeposition took place. Thus the polarisation effect became predom-inant and hence the concentration polarisation model became validfor this regime of filtration. In the subsequent stages of filtration withthe 100 kg mol−1 and 50 kg mol−1 membranes for the initial periodthe squared error value was least because of deposition of the soluteson the surface due to the previous filtration step. But as the watercontent in the feed was increased due to addition of the make-upwater, the washing or sweeping effect became pronounced in thecross-flow filtration module. Thus the deposited solutes on themembrane were taken away and thus lessen the validity of theconcentration polarisation model in the FPM. Again as the waterpermeation was more than the solute permeation, the retentate sidebecame enriched with the solute concentration with time and again anegative concentration gradient was built up on the membrane.

Fig. 12. Variation of the squared error between predicted %purity and experimental %purity with time at different stages of diafiltration for the 100 kg mol−1 UF membrane at ΔP=0.15 MPa and ζ=0.4 (—: Squared error for the FPM; ——: Squared error for the PHNN; and ······: Squared error for the PHNN).


Fig. 13. Variation of the squared error between predicted %purity and experimental %purity with time at different stages of diafiltration for the 50 kg mol−1 UF membrane at ΔP=0.15 MPa and ζ=0.4 (—: Squared error for the FPM; ——: Squared error for the PHNN; and ······: Squared error for the PHNN).


Therefore it can be seen from the figures that, apart from the feed filtration with the 100 kg mol−1 membrane, all the curves go through a maximum.
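The squared-error curves discussed here can be computed along the following lines. This is a minimal sketch only: the function names and the sample numbers are illustrative assumptions and are not data from the study.

```python
from typing import List, Tuple


def squared_errors(predicted: List[float], experimental: List[float]) -> List[float]:
    """Pointwise squared error between predicted and experimental %purity."""
    if len(predicted) != len(experimental):
        raise ValueError("both series must have the same length")
    return [(p - e) ** 2 for p, e in zip(predicted, experimental)]


def peak(times: List[float], errors: List[float]) -> Tuple[float, float]:
    """Return the (time, squared error) pair at which the error is largest."""
    idx = max(range(len(errors)), key=errors.__getitem__)
    return times[idx], errors[idx]


# Hypothetical values for one filtration stage (NOT taken from the paper).
times = [0.0, 600.0, 1200.0, 1800.0]      # s
fpm_purity = [40.0, 44.0, 47.0, 49.0]     # model predicted %purity
exp_purity = [41.0, 47.0, 49.0, 50.0]     # experimental %purity

errors = squared_errors(fpm_purity, exp_purity)
print(errors)               # [1.0, 9.0, 4.0, 1.0]
print(peak(times, errors))  # (600.0, 9.0)
```

With real data, a curve whose peak lies strictly inside the stage (rather than at its start or end) corresponds to the "goes through a maximum" behaviour described in the text.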

From the figures it is observed that in the first stage of diafiltration with the 100 kg mol−1 and 50 kg mol−1 membranes the squared error term increases at the end of the filtration step, whereas it could be expected that at the end the solute concentration increases because of greater water permeation and thus the prediction would be more accurate. The deviation could be due to ignoring the fouling caused by adsorption of the protein/organic molecules on the PES membrane [23,24], so that the prediction using the concentration polarisation model becomes inaccurate. But in the second stage with these two membranes, the error was lower during the first period of the filtration because of dilution of the feed to the filtration unit, and after that it goes through a maximum value. In the 100 kg mol−1 membrane even

Fig. 14. Variation of the squared error between predicted %purity and experimental %purity with time at different stages of diafiltration for the 5 kg mol−1 UF membrane at ΔP=0.15 MPa and ζ=0.4 (—: Squared error for the FPM; ——: Squared error for the PHNN; and ······: Squared error for the PHNN).

because of this adsorption the squared error value is almost eight times more than the value obtained at the end of the filtration with undiluted feed. In the 50 kg mol−1 membrane, when the feed was first introduced it was almost three times diluted. But as the MWCO was reduced, the permeation of the microsolutes was restricted by the lower MWCO. Therefore the resistance was not offered only by the concentration polarisation layer, which limits the validity of the model in predicting the percentage purity. But as the dilution was increased in the 50 kg mol−1 membrane, the squared error term was reduced here compared to the 100 kg mol−1

membrane. With the 5 kg mol−1 membrane this squared error value was almost 16 times less than with the 100 kg mol−1 and 50 kg mol−1 membranes on average. But on its smaller scale it resembles the deviations that were obtained in the case of the 100 kg mol−1 and 50 kg mol−1 membranes. Table 3 shows the analysis of the FPM for each of the separation stages. It is found that the probability value for making a Type I error (rejecting the null hypothesis even if it is true) with experimental and model predicted values is least for the 50 kg mol−1 MWCO membrane. For the 100 kg mol−1 membrane it is around 3% (very close to the 5% significance level) and for the 5 kg mol−1 membrane it is around 91%. Actually, apart from the assumptions on protein adsorption or any other membrane–protein interactions, MWCO, which could be another factor for rejection, was not considered during the formulation of the model. This effect became more pronounced when the MWCO of the membrane was reduced almost 10 times from the previous membrane step, i.e. with the 5 kg mol−1 membrane. These discrepancies were minimised by the adoption of the neural network either in series or in parallel to the FPM. In Figs. 12 through 14 it can be observed that the FPM in conjunction with the neural network, in series or in parallel, shows a considerable decrease in the squared error term.
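The two ways of coupling the neural network to the FPM can be sketched as follows. This is only an illustration under stated assumptions: in the parallel arrangement the network is taken to predict the FPM residual (assumed here to be experimental minus FPM value), and in the series arrangement the network supplies the macro- and microsolute wall concentrations that the FPM then converts to %purity. All function names and the toy stand-ins at the bottom are hypothetical and do not reproduce the models used in the study.

```python
from typing import Callable, Sequence, Tuple

Inputs = Sequence[float]


def parallel_hnn(fpm: Callable[[Inputs], float],
                 residual_net: Callable[[Inputs], float]) -> Callable[[Inputs], float]:
    """PHNN sketch: the network output corrects the FPM prediction additively."""
    def predict(x: Inputs) -> float:
        return fpm(x) + residual_net(x)
    return predict


def series_hnn(concentration_net: Callable[[Inputs], Tuple[float, float]],
               purity_from_wall: Callable[[float, float], float]) -> Callable[[Inputs], float]:
    """SHNN sketch: the network estimates wall concentrations, the FPM maps them to %purity."""
    def predict(x: Inputs) -> float:
        c_macro, c_micro = concentration_net(x)
        return purity_from_wall(c_macro, c_micro)
    return predict


# Toy stand-ins (hypothetical placeholders, NOT the paper's FPM or trained networks).
def toy_fpm(x: Inputs) -> float:
    return 50.0


def toy_residual_net(x: Inputs) -> float:
    return 2.5


def toy_concentration_net(x: Inputs) -> Tuple[float, float]:
    return 3.0, 1.0


def toy_purity(c_macro: float, c_micro: float) -> float:
    # Placeholder relation used only to make the sketch runnable.
    return 100.0 * c_macro / (c_macro + c_micro)


phnn = parallel_hnn(toy_fpm, toy_residual_net)
shnn = series_hnn(toy_concentration_net, toy_purity)

x = [0.15, 0.4, 600.0]  # hypothetical inputs, e.g. pressure, ζ and time
print(phnn(x))  # 52.5
print(shnn(x))  # 75.0
```

The sketch makes the structural difference visible: in the parallel form any bias built into the FPM prediction is part of the residual target, whereas in the series form the network output passes through the FPM before it reaches the final prediction.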

During the training procedure of the neural network in the PHNN, the weights of the neural network, including the bias weights, are limited to 21 so that the data to parameter ratio (χ) is 3.5, which defines the upper limit for the number of weights during the training procedure [14]. In the case of the SHNN this ratio was limited to 4.8. Table 4 presents an overall analysis of the HNN, showing that in most cases the SHNN is more accurate than the PHNN, at the cost of a larger number of iterations and hence a longer process time. Especially in the case of the 5 kg mol−1 membrane the value of R2 is 0.977 for the SHNN whereas it is 0.897 for the PHNN. This discrepancy arises because the architecture followed is different in the two cases. As discussed above, the assumption regarding the existence of a film layer was made in predicting the macrosolute concentration during the evaluation of the purity percentage using the FPM. The PHNN utilises the residuals calculated on the purity percentage values using the FPM, which act as the target set for the neural network. Therefore, the prediction from the PHNN implicitly carries the effect of the assumptions, which compromises the accuracy of the neural network as compared to the SHNN. In the case of the SHNN the macro- and microsolute concentrations on the membrane were calculated using the neural network without making any prior assumptions and ultimately fed to the FPM to evaluate the purity percentage. Thus, there is a considerable gain in the accuracy of the model. A similar observation on the accuracy level of these two networks is found in the case of the 100 kg mol−1 membrane. But in this case the assumption of the film layer holds good partially, showing a comparable goodness of fit for the PHNN as well as the

Table 3
Analysis of the FPM model in the parallel HNN for each stage of the membrane.

Membrane module    R2       P (αType I = 0.05)
100 kg mol−1       0.621    0.03103
50 kg mol−1        0.836    0.00069
5 kg mol−1         0.040    0.91195


Table 4
Analysis of the HNN for each stage of the membrane.

Membrane module    HNN type    R2       Structure (Ie-Hf-Og)    Weights    χ      Process time (CPU time in s)
100 kg mol−1       PHNN        0.985    3-4-1                   21         3.5    7200
                   SHNN        0.990    4-3-1                   19         4.8    12,000
50 kg mol−1        PHNN        0.992    3-4-1                   21         3.5    7800
                   SHNN        0.987    4-3-1                   19         4.8    11,700
5 kg mol−1         PHNN        0.897    3-4-1                   21         3.5    7200
                   SHNN        0.977    4-3-1                   19         4.8    13,800

e Number of input nodes/neurons.
f Number of hidden nodes/neurons.
g Number of output nodes/neurons.


SHNN. In the case of the 50 kg mol−1 membrane, in contrast, the PHNN predictions were quite good and comparable to the SHNN predictions, as shown by the close values of R2. Another advantage of the PHNN was its shorter process time compared to the SHNN.
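The weight counts listed in Table 4 follow from the network structures if one assumes a fully connected network with a single hidden layer and one bias per hidden and output neuron. The sketch below illustrates that bookkeeping; the function names are illustrative, and the number of data points passed to the ratio function is a hypothetical value, not one reported in the study.

```python
def count_weights(n_input: int, n_hidden: int, n_output: int) -> int:
    """Weights plus biases of a fully connected I-H-O network (assumed layout)."""
    hidden_layer = n_input * n_hidden + n_hidden    # input->hidden weights + hidden biases
    output_layer = n_hidden * n_output + n_output   # hidden->output weights + output biases
    return hidden_layer + output_layer


def data_to_parameter_ratio(n_data: int, n_weights: int) -> float:
    """Ratio χ of training data points to adjustable parameters."""
    return n_data / n_weights


print(count_weights(3, 4, 1))  # 21, matches the PHNN structure 3-4-1
print(count_weights(4, 3, 1))  # 19, matches the SHNN structure 4-3-1

# Hypothetical data set size, used only to show the call.
print(data_to_parameter_ratio(84, 21))  # 4.0
```

Keeping χ above a chosen threshold is a simple guard against fitting more parameters than the data can support.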

6. Conclusion

In the present work an attempt was made to combine the mechanistic model and an ANN (a black-box model) for the prediction of the percentage purity in the ultrafiltration of whey proteins. In particular, the present study shows the efficacy of a combinatorial hybrid model when UF was carried out in conjunction with the dilution mode (diafiltration). In this regard, a comparative study was made of the two types of architecture of the hybrid neural network: the series HNN and the parallel HNN. The performance of the hybrid network was governed by the assumptions made during the formulation of the FPM, especially in the case of the parallel system. The process time of the simulation was found to be less in the parallel HNN compared to the series HNN. It can be inferred from the present work that the accuracy of the model is somewhat compromised in the case of the PHNN compared to the SHNN because of the erroneous assumptions. In all cases, for membranes of different MWCOs, the SHNN shows an absolute average deviation (AAD) of 15%, whereas the PHNN shows a deviation of 22–27%, providing strong support for the adoption of the series approach for KBHNN in membrane separation processes. But in some cases where the process dynamics was accurately captured, the PHNN could be a good alternative to the SHNN considering its shorter process time.
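For reference, an absolute average deviation of the kind quoted above can be evaluated as sketched below. The definition used here, the mean of |predicted − experimental| / |experimental| expressed in percent, is a common convention and is an assumption; the text does not state the exact formula. The function name and the sample numbers are illustrative only.

```python
from typing import Sequence


def absolute_average_deviation(predicted: Sequence[float],
                               experimental: Sequence[float]) -> float:
    """AAD in percent, assuming the mean relative absolute deviation definition."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs(p - e) / abs(e) for p, e in zip(predicted, experimental))
    return 100.0 * total / len(predicted)


# Hypothetical %purity values (NOT data from the study).
predicted = [30.0, 75.0]
experimental = [40.0, 50.0]
print(absolute_average_deviation(predicted, experimental))  # 37.5
```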

Nomenclature
Vm  Volume of the retentate (m3)
VP,m  Volume of the permeate (m3)
V0  Initial volume of the feed (m3)
Vw  Volume of the make-up water (m3)
CBulk,Micro,m  Microsolute concentration in the retentate (kg m−3)
CP,Micro,m  Microsolute concentration in the permeate (kg m−3)
C0,Micro  Initial microsolute concentration in the feed (kg m−3)
Cw,Micro,m  Microsolute concentration at the wall of the membrane (kg m−3)
CBulk,Macro,m  Macrosolute concentration in the retentate (kg m−3)
CP,Macro,m  Macrosolute concentration in the permeate (kg m−3)
C0,Macro  Initial macrosolute concentration in the feed (kg m−3)
Cw,Macro,m  Macrosolute concentration at the wall of the membrane (kg m−3)
t  Time of the filtration process (s)
ki,Micro,avg  Average mass transfer coefficient for microsolute (m s−1)
ki,Macro,avg  Average mass transfer coefficient for macrosolute (m s−1)
Sh  Sherwood number
Re  Reynolds number
Sc  Schmidt number
P  %Purity of protein
TOL  Tolerance value

Greek symbols
ζ  Volume reduction ratio = Vm(t)/V0

Subscripts
P  Permeate
Micro  Microsolute
Macro  Macrosolute
m  Diafiltration stage number
0  Parameter at initial time
NN  Neural network
FPM  First principle model
PHNN  Parallel hybrid neural network
SHNN  Series hybrid neural network
Exp  Experimental

Acknowledgements

The work reported in this article is part of an Indo-Australian project (vide sanction letter no. BT/PR9547/ICD/16/754/2006 of DBT/Indo-Aus/01/35/06 dated July 02, 2007), jointly funded by DEST (Australia) and DBT (India). The project involves collaboration between University of Melbourne and Burra Foods (Australia), and Jadavpur University and IIT Kanpur (India). Accordingly, the contributions of DEST (Australia) and DBT (India) are gratefully acknowledged.

References

[1] L. Masse, D.I. Massé, Y. Pellerin, J. Dubreuil, Osmotic pressure and substrate resistance during the concentration of manure nutrients by reverse osmosis membranes, J. Membr. Sci. 348 (2010) 28–33.

[2] G. Zhang, Z. Liu, L. Zhao, H. Li, Q. Zhou, F. He, Z. Xu, H. Wang, Recovery of glutamic acid from ultrafiltration concentrate using diafiltration with isoelectric supernatants, Desalination 154 (2003) 17–26.

[3] G. Foley, Water usage in variable volume diafiltration: comparison with ultrafiltration and constant volume diafiltration, Desalination 196 (2006) 160–163.

[4] A.F. Vikbjerg, G. Jonsson, H. Mu, X. Xu, Application of ultrafiltration membranes for purification of structured phospholipids produced by lipase-catalyzed acidolysis, Sep. Purif. Technol. 50 (2006) 184–191.

[5] M. Arabelle, D. Georges, C. Bernard, Ultrafiltration modes of operation for the separation of α-lactalbumin from acid casein whey, J. Membr. Sci. 153 (1999) 9–21.

[6] G. Williams, S. Nakao, C.A. Smolders, Flux limitation in ultrafiltration: osmotic pressure model and gel layer model, J. Membr. Sci. 20 (1984) 115–124.

[7] A.G. Fane, Ultrafiltration: factors influencing flux and rejection, in: R.J. Wakeman (Ed.), Progress in Filtration and Separation, vol. IV, Elsevier, Amsterdam, 1986, pp. 101–179.

[8] P. Aimar, R. Field, Limiting flux in membrane separations: a model based on the viscosity dependency of the mass transfer coefficient, Chem. Eng. Sci. 47 (1992) 579–586.

[9] S. Srebnik, Polymer adsorption on multicomponent surfaces with relevance to membrane fouling, Chem. Eng. Sci. 58 (2003) 5291–5298.

[10] G. Serpen, D.K. Tekkedil, M. Orra, A knowledge-based artificial neural network classifier for pulmonary embolism diagnosis, Comput. Biol. Med. 38 (2008) 204–220.

[11] C. Bhattacharjee, D. Sen, P. Sarkar, S. Datta, P.K. Bhattacharya, Studies on the application of different ANNs to predict permeate flux in rotating disk membrane modules: a case study with MATLAB™, Desalin. Water Treat. 2 (2009) 170–184.

[12] S. Curcio, V. Calabro, G. Iorio, Reduction and control of flux decline in cross-flow membrane processes modeled by artificial neural networks and hybrid systems, Desalination 236 (2009) 234–243.

[13] S. Curcio, G. Scilingo, V. Calabrò, G. Iorio, Ultrafiltration of BSA in pulsating conditions: an artificial neural networks approach, J. Membr. Sci. 246 (2) (2005) 235–247.

[14] G.M. Bollas, S. Papadokonstadakis, J. Michalopoulos, G. Arampatzis, A.A. Lappas, I.A. Vasalos, A. Lygeros, Using hybrid neural networks in scaling up an FCC model from a pilot plant to an industrial unit, Chem. Eng. Process. 42 (2003) 697–713.

[15] M.L. Thompson, M.A. Kramer, Modeling chemical processes using prior knowledge and neural networks, AIChE J. 40 (8) (1994) 1328–1340.

[16] S. De, P.K. Bhattacharya, Flux prediction of black liquor in cross flow ultrafiltration using low and high rejecting membranes, J. Membr. Sci. 109 (1996) 109–123.

[17] M.M. Bradford, Rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding, Anal. Biochem. 72 (1976) 248–254.


[18] N.Z. Ballin, Estimation of whey protein in casein coprecipitate and milk powder by high-performance liquid chromatography quantification of cysteine, J. Agric. Food Chem. 54 (2006) 4131–4135.

[19] B. Schirmer, A. Melling, F. Durst, A new method for the determination of membrane permeability by spatially resolved concentration measurements, Meas. Sci. Technol. 15 (2004) 195–202.

[20] P. Georgieva, S.F. de Azevedo, Application of feed forward neural networks in modeling and control of a fed-batch crystallization process, Proc. World Acad. Sci. Eng. Technol. 12 (2006) 65–70.

[21] M.J. Kamali, M. Mousavi, Analytic, neural network, and hybrid modeling of supercritical extraction of α-pinene, J. Supercrit. Fluids 47 (2008) 168–173.

[22] J.S. Anderson, T.J. McAvoy, O.J. Hao, Use of hybrid models in wastewater systems, Ind. Eng. Chem. Res. 39 (2000) 1694–1704.

[23] D. Kuzmenko, E. Arkhangelsky, S. Belfer, V. Freger, V. Gitis, Chemical cleaning of UF membranes fouled by BSA, Desalination 179 (2005) 323–333.

[24] S. Salgın, S. Takaç, T.H. Özdamar, Adsorption of bovine serum albumin on polyether sulfone ultrafiltration membranes: determination of interfacial interaction energy and effective diffusion coefficient, J. Membr. Sci. 278 (2006) 251–260.
