https://doi.org/10.31449/inf.v45i4.3503
Informatica 45 (2021) 605–616

A Blockchain and NLP Based Electronic Health Record System: Indian Subcontinent Context

Pranab Kumar Bharimalla
School of Computer Engineering, KIIT University, India
E-mail: pranab.bharimalla@gmail.com

Hammad Choudhury
Infosys Ltd., USA
E-mail: hchoudhury99@gmail.com

Shantipriya Parida
Idiap Research Institute, Martigny, Switzerland
E-mail: shantipriya.parida@idiap.ch

Debasish Kumar Mallick and Satya Ranjan Dash (Corresponding Author)
School of Computer Applications, KIIT University, India
E-mail: mdebasishkumar@gmail.com, sdashfca@kiit.ac.in

Keywords: blockchain, electronic health record, ResNet, CNN, natural language processing

Received: April 8, 2021

The healthcare system in the Indian subcontinent is plagued with numerous issues related to the access, transfer, and storage of patients' medical records. The lack of infrastructure to properly communicate and track records between all key participants has allowed the distribution of counterfeit drugs, dependency on unsafe methods of communication, and a lack of trust between patients and providers. The global COVID-19 pandemic has further emphasized the need for a robust communication and record tracking system. To facilitate efficient communication and mitigate these issues, a nationwide EHR (electronic health record) system must be introduced to bring the healthcare system into the digital space. To further improve security, efficiency, and cost, Blockchain is introduced. Blockchain is a decentralized data structure that allows secure transactions between untrusted parties without needing a central authority. In this paper, a Hyperledger Fabric-based Blockchain Electronic Health Record (EHR) system is proposed. The system is integrated with technologies such as NLP (Natural Language Processing) and Machine Learning to provide users with practical features.
Povzetek (summary): An electronic health record based on blockchain and NLP technologies is presented in the context of India.

1 Introduction

The SARS-CoV-2 pandemic, commonly known as the Coronavirus pandemic, has challenged healthcare systems worldwide and has exposed vulnerabilities even among the best prepared, owing to uncertainty about transmission, the unavailability of patients' proper medical histories, and a lack of adequate contact tracing, to name a few examples. Tracking a threat such as this pandemic requires dynamic adaptation of resource deployment to manage rapidly evolving care demands, ideally based on real-time data from a large population sample. Healthcare issues plague all nations regardless of development status due to environmental, economic, or societal conditions; according to the Institute of Medicine (IoM), over one hundred thousand people die each year from preventable medical errors in the US [2]. While healthcare systems in developed nations are in no way perfect, as stated by the IoM and shown by the pandemic, developing or underdeveloped regions such as the Indian subcontinent remain more vulnerable. Some common issues arise from the lack of communication and tracking infrastructure, such as the influx of counterfeit drugs in the market, dependence on handwritten prescriptions, especially in remote areas lacking any computer systems, and almost nonexistent integration between healthcare and insurance systems. The inadequate infrastructure contributes to a lack of accountability among healthcare providers and further damages relations with patients.

Resolving the many healthcare issues faced in the Indian subcontinent is a formidable challenge, but the situation can be significantly improved with the implementation of an end-to-end integrated Electronic Health Record (EHR). In comparison to paper-based record keeping, a practice still utilized in the Indian subcontinent, EHR has clear and distinct advantages [7].
It is fair to say that EHR holds a lot of promise: lower morbidity and mortality rates, better continuity of care, increased efficiencies, fewer adverse drug reactions, and, most importantly, lower healthcare costs. Paper-based records are more susceptible to human error due to fundamental factors such as legibility or loss of the physical item, causing delays in treatment and possibly preventable fatalities [18]. The impact of EHR will also be felt during the coming months as the global effort to distribute and administer the Coronavirus vaccine intensifies.

An ideal healthcare system should be open, affordable, innovative, and secure. Care costs should not prevent patients from receiving required treatments or buying medicines, and healthcare should be affordable to all regardless of wealth [19]. In the current scenario, an ideal healthcare system might seem far-fetched, but there is always room for improvement using the latest innovations [13]. As our digital infrastructure evolves, the need for robust privacy is increasing. Because of the sensitivity of an individual's healthcare information, a breach in the system could jeopardize the identity of a patient and the reputation of providers. According to [10], patient data contains information that is highly prized by cybercriminals. A few notable breaches of medical information, such as the AMCA Data Security Incident (approx. 25M records) and the Anthem data breach (approx. 80M records), caused enormous damage to the medical system [20]. Blockchain technology could help secure and protect sensitive patient information and is emerging as an alternative to the conventional approach, in which transactions are logged and data is transferred through a trusted intermediary that provides validity to the transaction [3].
So far, we have discussed various healthcare issues prevailing in the Indian subcontinent, such as the lack of complete end-to-end EHR interlinking between individuals, stand-alone hospital recording systems, extensive use of handwritten prescriptions, unsafe hospital databases vulnerable to data manipulation by hospital authorities without patient permission, and a lack of accommodation for caregivers. We have also outlined how a Blockchain-based EHR system can address these problems. This article proposes a patient-centered, Blockchain-based EHR system that identifies patients using their national ID, grants individuals control over their health records, protects against unauthorized data manipulation, and allows scalability. Cloud integration is used to store large files such as CT scans and X-ray reports.

A patient-centered EHR system must accurately identify individuals using unique identifiers. There is currently no end-to-end EHR system integrated with national ID cards like Aadhar or PAN (Permanent Account Number). Integrating unique identifiers like national IDs is essential to successfully identify patients. The proposed scalable system will enable existing EHR and healthcare databases to be integrated, along with other useful features, including a mobile app-based interface that converts paper prescriptions to text using Natural Language Processing (NLP) algorithms to bring old paper-based medical records into the new system.

The paper is organized as follows: Section 2 briefly describes related work, its major contributions, and research gaps. The proposed system architecture is presented in Section 3, followed by the proposed algorithms. Section 4 discusses the implementation strategy as well as the analysis of the results. Finally, Section 5 provides the conclusion and suggestions for future research.
2 Related work

With the advent of different Blockchain platforms like Hyperledger Fabric, Ethereum, and Azure Blockchain Workbench, many patient-centric, permission-based Blockchain EHR schemes have been proposed in the literature. [17] proposed a permission-based EHR sharing system intended to enhance security and privacy. They also designed an access control policy algorithm with a smart contract and formulated a performance optimization mechanism for the system. However, this work ignores the reusability of existing healthcare records sitting in individual hospital databases. [8] proposed a Blockchain-based permissioned EHR system capable of integrating data from local standalone EHR systems housed in different hospitals or clinics. The framework stores only metadata and access information in the Blockchain, whereas the actual health records are stored in the cloud. This is a novel concept, but the actual patient-sensitive EHR data resides in the cloud and does not enjoy the immutability that Blockchain offers. [15] extended the framework proposed in [17] by adding a few new modules, such as chemist, insurance, and doctor's appointment modules. Even though the authors formulated a comprehensive approach, the work ignores the scalability of data and the interoperability of existing EHR and healthcare databases. [6] proposed a similar Blockchain-based searchable encryption scheme for EHR that brings convenience not only to patients and healthcare providers but also to researchers. In the proposed system, only indexes are added to the Blockchain, whereas the actual patient information is stored in an encrypted format in the cloud. There are different ways of encrypting data in the cloud and achieving privacy preservation. [9] defined a novel way of splitting EHR records into sub-messages and then constructing shares of the EHR that are stored locally on different computer nodes, while the indexes are uploaded to the healthcare Blockchain.
[3, 11] evaluated the performance of common public and private/consortium Blockchain-based healthcare systems using metrics such as memory consumption, disk write and read performance, network data utilization, transaction execution per unit time, and CPU usage, with consortium-based systems yielding the best performance results. [4] proposed a framework called Blockchain-Based Deep-Learning as-a-Service. The framework shares EHR records among multiple healthcare users and operates in two phases: it prevents collusion attacks through authentication and predicts possible future conditions for patients through deep learning. [5] proposed a Hyperledger Fabric-based Blockchain architecture that allows access to the database based on user roles and enhances the traditional encryption system with a quantum blind signature to protect the system from quantum attacks. On the other hand, [14] proposed a system that uses AES cryptography to perform the cryptographic operations and chains the blocks through hash keys. In addition, their proposed healthcare ecosystem includes a prediction model that diagnoses the patient's disease with a deep learning algorithm.

In addition to the related works mentioned above, we also surveyed the literature on ANN-based text extraction models that could contribute to our work. [1] implemented an Artificial Neural Network (ANN) approach for text extraction from 64 different types of prescriptions with 98% accuracy. [12] proposed a common CNN approach for recognizing handwritten numerals in scripts commonly found in India, including Odia, Telugu, Devanagari, Bangla, and English. The reported accuracy was 95% for Bangla, 98.54% for Devanagari, 97.2% for Odia, 96.5% for Telugu, and 99.10% for English characters.
3 Proposed model

The proposed system prototype is based on Hyperledger Fabric, an open-source distributed ledger technology built to meet enterprise requirements. It is a widely used private Blockchain option. As a permissioned platform with improved configurability and modularity through pluggable consensus protocols, it is well suited to a range of industries requiring a level of trust between known participants in a governance model. For a transaction to occur, the user must be enrolled with the organization's certificate authority and have received the means for network authentication, the chaincode must be installed on the peers and deployed to the channel, and both parties must have agreed to the endorsement policy. A transaction proposal is constructed using an SDK, along with a signature, to call a chaincode function with parameters to update the ledger. Once the endorsing peers have verified the proposal, the chaincode is executed against the current state database, producing a response, a read set, and a write set; these values and the signature are sent back to the SDK to be parsed and consumed by the application. Once the application has validated the responses, the proposal and response are bundled into a transaction message for the ordering service; the ordering service creates blocks of transactions per channel. The blocks are sent to all peers in the channel, the peers complete a final validation, the ledger is updated, and the peer reports to the client whether the transaction was validated or invalidated.

3.1 System architecture

In the proposed patient-centric, Blockchain-based healthcare system, there are seven key participants: patients or the public, doctors or caregivers, pharmacies, labs, insurance company staff, government institutions, and the admin user. Assistance from the government is necessary to implement a functional Blockchain-based EHR system.
The government provides credibility to insurance companies, labs, pharmacies, doctors, and even patients through national identification numbers. The proposed system gives patients substantial control over their medical records, including the right to read, write, authorize, and revoke records in the Hyperledger Fabric Blockchain network. Doctors work closely with patients to diagnose conditions, plan treatments, and prescribe medication. Pharmacies work in parallel with doctors to distribute medications to patients. Because an accurate diagnosis depends on many factors, labs specialize in detecting distinct conditions with the help of specific tools and trained professionals. Insurance companies help share risk across a large population, making the cost of healthcare affordable for the public, especially during unexpected events such as accidents. Doctors, pharmacies, labs, and insurance companies can read and update the patient's medical record in the Hyperledger Fabric Blockchain network if access has been provided. The admin is critical for system maintenance and has unrestricted access to the system, including the right to read, write, update, remove, and grant access to participants in the Hyperledger Fabric Blockchain network. The admin's enrollment certificate is obtained from the certification authority. In India, implementing a national health portal or EHR system is not possible without the government's involvement due to the sizable population of 1.4 billion. In the proposed system, the government institution acts as a founder organization or trusted anchor that can provide credibility to hospitals, pharmacies, insurance companies, labs, and other institutions and provide them with trusted roles to play.
The participants with trusted roles can create and issue credential schemas and definitions to the public or patients. The participants can register through a client application or SDK and request an enrollment certificate from the certificate authority via a Membership Service Provider (MSP). An MSP allows peers to validate incoming transactions and sign off endorsements. After receiving the enrollment request, the certificate authority issues the certificate and private key with a new ID to enroll the participant. The Hyperledger Fabric Blockchain network distributes all transactions. Participants such as doctors, pharmacies, labs, and insurance companies have different roles in the system and are only granted access when authorized. In the proposed system, an individual's identity can be verified against a national identification number such as Aadhar and be structured to contain identity information like name, date of birth, gender, and other identifying information that can be fetched from the Aadhar database. Patients can use the client application to update details like blood group, allergies, medications, and insurance details. Once the transaction is submitted, it is broadcast to the network. Endorsing peers verify the transaction and authenticate using their certificate and private key. The transaction next goes to the Orderer through the SDK client. The Orderer orders the transactions into blocks according to the configured ordering algorithm (for example, crash fault-tolerant) and broadcasts the blocks to the network peers. All the committing peers validate the blocks once again, check that they come from the correct Orderer, and check for conflicts before committing. The MSP is the body that manages the network identities of organizations and users; however, it does not have access to medical records on the Blockchain network.
The MSP verifies participants based on TIN/PAN before enrolling them within the network. CouchDB, a NoSQL database that stores data in a JSON-based format, is a popular database option used alongside Hyperledger Fabric. A network consists of groups of peers that hold ledgers and smart contracts, which encapsulate shared network processes and information. In Hyperledger Fabric, the smart contracts that generate transactions are packaged in chaincode. The key components and processes of the proposed system are highlighted in the architecture shown in Figure 1. Algorithm 1 describes the enrollment of patients, whereas Algorithm 2 covers the addition of hospitals, pharmacies, and other institutions to the network. Table 1 lists the abbreviations used in both algorithms.

3.2 Data pulling and sharing

In a permissioned Blockchain system, patients have the authority to decide who can read or update their records; however, the admin reserves the right to grant access to an institution in case of an emergency. Individual hospitals will integrate their EHR system with a Blockchain node and a web API that has full access to their local EHR, in order to convert any existing SQL records to NoSQL format for storage within the Blockchain. A hybrid data management approach is used to facilitate EHR data scalability: all key patient information, including demographics, allergies, medications, and access controls, is stored in the Blockchain, while sensitive medical files such as X-ray and scan reports are stored encrypted in private cloud storage (Figure 2).

In the proposed system, a patient can provide access to institutions and participants, including hospitals, doctors, insurance companies, and pharmacies, through the user interface of the web or mobile app. The patient needs to identify the participant requiring access, the category of data to be shared, and the period during which the data is accessible.
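The three pieces of information a patient supplies when granting access (participant, data category, and access period) can be sketched as a small data structure. This is only an illustration in Python and not the system's chaincode: the `AccessGrant` record, its field names, and the helpers `grant_access` and `is_allowed` are hypothetical names introduced here, and a real deployment would keep such grants on the ledger.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessGrant:
    """A single patient-issued permission (hypothetical record layout)."""
    patient_id: str       # identifier of the patient who owns the records
    participant_id: str   # hospital, doctor, insurer or pharmacy given access
    category: str         # category of data shared, e.g. "medications"
    expires_at: datetime  # access is valid strictly before this moment


def grant_access(patient_id, participant_id, category, start, duration):
    """Create a grant that is valid for `duration` starting at `start`."""
    return AccessGrant(patient_id, participant_id, category, start + duration)


def is_allowed(grants, participant_id, patient_id, category, now):
    """Return True if an unexpired grant covers this participant and category."""
    return any(
        g.participant_id == participant_id
        and g.patient_id == patient_id
        and g.category == category
        and now < g.expires_at
        for g in grants
    )


if __name__ == "__main__":
    t0 = datetime(2021, 4, 8, 9, 0)
    grants = [grant_access("patient-1", "pharmacy-7", "medications",
                           t0, timedelta(days=2))]
    # Within the period and category: allowed.
    print(is_allowed(grants, "pharmacy-7", "patient-1", "medications",
                     t0 + timedelta(days=1)))
    # Different category: not allowed.
    print(is_allowed(grants, "pharmacy-7", "patient-1", "allergies",
                     t0 + timedelta(days=1)))
```

Keeping the expiry inside the grant means that revocation by timeout needs no separate clean-up step: an expired grant simply stops matching.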
If a patient has visited a particular institution in the past and the local EHR system contains medical records that are not found on the Blockchain network, the patient can initiate a pull request using the web app. Once the pull request is approved by the institution's admin user, the web app connects to the local EHR system to fetch the relevant data, inserts the patient information into the Blockchain network, and uploads large files to the private cloud in an encrypted format.

3.3 Patient data management at hospital

During a routine or emergency hospital visit, the patient provides the hospital access to their medical records on the Blockchain network so that the records can be checked and amended based on the latest assessment. The hospital must have a valid node on the Blockchain network and must request keys from the network admin to permit the login. The patient selects the category of data to be shared and how long the hospital will have access to the records. Once the hospital has been provided access, it can read and update the records for an individual using their Aadhar identification card, as described in Figure 3.

3.4 Patient data management at pharmacy, patho lab and insurance firm

During a pharmacy visit to fulfill a prescription, the patient provides the pharmacy access to their medical records on the Blockchain network, assuming the pharmacy has a valid node on the network and has requested keys from the network admin to enable login. The patient can provide itemized permissions, so that the pharmacy only has access to selected prescriptions through private datasets, and can control how long the pharmacy has access to the records. Once the pharmacy has been provided access, it can read and update the records for an individual using their Aadhar identification card.

During a lab visit at a specialized facility to assess specific conditions, the patient provides the lab access to their medical records on the Blockchain network.
The lab requires a valid node on the network and must have requested keys from the network admin to allow login. The patient controls what data is to be shared and how long the lab will have access. Once the lab has been provided access, it can read and update the records for an individual using their Aadhar identification card. Large lab files such as X-ray reports can be encrypted and uploaded to private cloud storage.

Similarly, insurance firms need to update a patient's insurance and policy information for policy procurement or medical expense claims. The patient provides the insurance firm access to their medical records on the Blockchain network if the institution has a valid node on the network and has requested keys from the network admin to authorize a login. The patient selects the type of data to be shared and how long the insurance firm will have access to the record. Once access has been provided, the firm can read and update the records for an individual using their Aadhar identification card.

3.5 Patient uploads old handwritten/printed prescriptions and bills

A core proposal of this paper is to convert paper prescriptions to text using Natural Language Processing (NLP) algorithms to bring old paper-based medical records into the new system through a mobile app-based interface. Figure 4 presents the system flow for NLP-based data extraction, highlighting the key components.

Figure 1: Proposed architectural framework for the Blockchain-based healthcare system. It depicts the key participants, components, and transaction processes involved.

Figure 2: Secured hospital data management and data pulling process from the local EHR.
3.5.1 Handwritten prescription data extraction

Convolutional Neural Network (CNN). A CNN is a neural network (NN) that performs convolution operations instead of general matrix multiplication in at least one of its layers. The structure of a CNN consists of convolution layers, pooling layers, and fully connected layers. Feature extraction is done in the convolution layer, and its output is passed to the activation function. Pooling layers reduce the size of the output and make the learned features more robust to variations in the input data. By applying the convolution and pooling layers multiple times, global features can be obtained. Finally, the extracted features are passed to the fully connected layer for regression and classification.

Residual Network (ResNet). According to image processing research, the number of layers, or depth, of a network is crucial for the performance of a model, but increasing the number of layers can lead to degradation. Much research shows that this type of degradation is caused not by overfitting but by difficulties in optimization. ResNet addresses the degradation problem by introducing a residual framework.

Long Short-Term Memory (LSTM). A Recurrent Neural Network (RNN) is a neural network specially designed for processing sequential data. In time-series data, the output at time step t - 1 affects the decision at the future time step t. However, an RNN cannot handle long sequences well because of what is called the vanishing or exploding gradient problem. LSTM was designed to resolve this problem. An LSTM has an internal memory cell called the cell state. It receives the previous output and determines which elements of the internal state vector should be updated, erased, or maintained. These processes are handled by four gates, among them the forget gate f_t, the output gate o_t, and the input gate g_t, which are shown in Figure 5. The proposed model consists of ResNet-50 and three LSTM layers.
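For reference, the residual connection and the LSTM cell update described above are commonly written as follows. This is the standard textbook formulation and is not taken from the paper: the weight matrices W, the biases b, the candidate state \tilde{c}_t, the hidden state h_t, and the input x_t are assumed symbols, while f_t, o_t, and g_t follow the gate names used in the text (with g_t as the input gate).

```latex
% Residual block: the stacked layers learn a residual F(x) added to the input x.
\[
  y = \mathcal{F}(x) + x
\]

% LSTM cell update at time step t (\sigma: logistic sigmoid, \odot: element-wise product).
\begin{align*}
  f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
  g_t &= \sigma\left(W_g\,[h_{t-1}, x_t] + b_g\right) && \text{input gate} \\
  o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
  \tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{candidate state} \\
  c_t &= f_t \odot c_{t-1} + g_t \odot \tilde{c}_t && \text{cell state} \\
  h_t &= o_t \odot \tanh(c_t) && \text{hidden state / output}
\end{align*}
```

The additive form of the cell-state update is what lets gradients flow across many time steps, which is why the LSTM mitigates the vanishing gradient problem mentioned above.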
To introduce non-linearity, the Rectified Linear Unit (ReLU) activation function is used in every convolutional layer. The images are divided into 28 sub-windows, so that the image height is equal to the height of the text-line images. The feature map is produced by the last convolution layer, and its output is fed into the first LSTM layer. The weights are optimized through the LSTM operations. Initially, a learning rate of 0.001 is used.

Figure 3: High-level access flow diagram. The patient can add their own records to the Blockchain or grant access to other participants, such as a hospital, laboratory, pharmacy, or insurance company, to add or update records.

Figure 4: NLP-based data extraction flow showing the components.

Algorithm 1: Patient Enrollment.
Input: Enrollment Certificate (E_C) from Certification Authority (C_A)
Output: Successful registration of patient
Initialization: N_Admin should be a valid node. N_Admin can Write/Read/Remove/Update patients;
while (true) do
    if P_AId is valid and FetchAadharRecord(P_AId) not null then
        P_Rec ← FetchAadharRecord(P_AId)
        Add Patient(B_NT, P_AId)
        Grant Access(P_AId)
        Create Record(P_AId, P_Rec, B_NT)
    else
        Invalid(P_AId)
    end
    bool chk (0: malicious; 1: genuine)
    if !(behaviour(chk)) then
        Remove Update(P_AId)
    else
        Add Update(P_AId)
    end
end

3.5.2 Printed prescription data extraction

For extracting printed prescription data, we used Google's Tesseract model [16], a pre-trained character recognition model made by Google. Before extraction, the image is preprocessed. To make the model more efficient, we first convert the image into grayscale. Otsu thresholding is applied next, converting the pixels into zeros and ones. During thresholding, some of the pixels may be lost.
To restore those pixels, Erosion and Dilation operations are performed, where Erosion expands some pixels and Dilation shrinks some pixels, as shown in Figure 6a, Figure 6b, and Figure 7. After preprocessing is complete, the image is sent to Google Tesseract. Tesseract extracts all text from the image and sends it to the network.

Figure 5: ResNet-LSTM approach for text extraction.

Figure 6: Pixel restoration using Erosion and Dilation operations. (a) Erosion operation. (b) Dilation operation.

Figure 7: Image with some pixels missing, and the result after the Erosion and Dilation operations.

4 Implementation and result analysis

4.1 Blockchain network setup

To realize the proposed architecture, Hyperledger Fabric and Sandbox are utilized. Hyperledger is an authentication and distributed ledger-based platform. It is an open-source technology used to implement different smart contracts with constraints and logic over the network for applications. The smart contracts are implemented over the network using the Sandbox module. In Sandbox, the participants are known and the Blockchain is in the permissioned consortium mode, making it a secure and trusted Blockchain. The proposed architecture is not limited to the healthcare domain. Programming languages such as Node.js, Java, and Go are used for contract and business network development. Docker is used for setup and initialization when working with Hyperledger Fabric and Composer. Docker is an operating-system-level container platform used by developers and system administrators for creating, deploying, and running business networks or Hyperledger-based applications in a container, enabling the dependencies and functionalities to be packaged together. The Hyperledger Fabric and Composer network can run inside a container using Docker.
Algorithm 2: Hospital, Pharmacy, Patho Lab and Insurance firm Enrollment.
Input: Enrollment Certificate (E_C) from Certification Authority (C_A)
Output: Access to all nodes H_N, P_N, L_N, I_N
Initialization: N_Admin should be a valid node. N_Admin can Read/Write/Update/Remove participants H_N, P_N, L_N, I_N;
while (true) do
    if H_N is valid and H_TIN is valid then
        Add Node(B_NT, H_N)
        Grant Access(H_N)
    else
        Invalid(H_N)
    end
    if P_N is valid and P_TIN is valid then
        Add Node(B_NT, P_N)
        Grant Access(P_N)
    else
        Invalid(P_N)
    end
    if L_N is valid and L_TIN is valid then
        Add Node(B_NT, L_N)
        Grant Access(L_N)
    else
        Invalid(L_N)
    end
    if I_N is valid and I_TIN is valid then
        Add Node(B_NT, I_N)
        Grant Access(I_N)
    else
        Invalid(I_N)
    end
    bool chk (0: malicious; 1: genuine)
    if !(behaviour(chk)) then
        Remove Update(H_N, P_N, L_N, I_N)
    else
        Add Update(H_N, P_N, L_N, I_N)
    end
end

In our simulation phase, we used a network model of 3 organizations with 2 peers each and one Orderer. The experiment is carried out with basic write transactions at various rates, with 1000 transactions per round at 50, 100, 150, 200, and 250 transactions per second. The experiment is run for the 1 Org 1 Peer, 2 Org 2 Peer, and 3 Org 3 Peer configurations, which show different transaction performance. The results are calculated over five rounds, with each round consisting of 1000 transactions at various transaction rates per second (tps). The graphs in Figures 8a and 8b show the average latency and throughput for varying transaction rates, along with the number of transactions completed per minute, for the three network models. The network model using 1 Org and 1 Peer had the lowest average latency and the highest throughput per transaction rate while completing the highest number of tasks per minute.
In contrast, the network model using 3 Org and 3 Peer had the highest average latency and the lowest throughput per transaction rate while completing the lowest number of tasks per minute. The network model using 2 Org and 2 Peer fell in between (i.e., for latency, 3 Org 3 Peer > 2 Org 2 Peer > 1 Org 1 Peer). We can therefore conclude from Figure 8a that latency increased as the system scaled up with more organizations and more peers. The throughput of 1 Org 1 Peer was the highest, at 190, and throughput decreased as the number of organizations and peers increased: it was 182 for 2 Org 2 Peer and 180 for 3 Org 3 Peer (Figure 8b). Figure 8c shows the successfully completed transactions per minute. 1 Org 1 Peer completed 5000 transactions in around 4 minutes, whereas 2 Org 2 Peer completed 4500 transactions and 3 Org 3 Peer completed 4000 transactions in the same time. As a result, transaction time was observed to increase as the number of organizations and peers grew. Figure 8d, on the other hand, shows the CPU consumption per network model. For different transaction rates, the network models resulted in varying average CPU usage. We observed that, among all peers, peer1.org1.example.com reached the highest CPU utilization at a transaction rate of 200 per second, whereas peer0.org1.example.com recorded the lowest CPU utilization at a transaction rate of 100 per second. Table 2 details other resource consumption parameters. With these experimental results, we move on to our next set of experiments, related to data extraction from prescriptions.

4.2 Handwritten prescription data extraction: training and validation

In regions dominated by paper prescription usage, providing the ability to transfer vast amounts of existing data into the digital space is essential.
Allowing users to upload prescriptions simplifies the transition to an electronic system and populates the system with useful patient data, as illustrated in Figures 9a, 9b, and 10. The following experiments relate to training and validation for data extraction from handwritten prescriptions. The first step of training requires specifying the number of inputs, hidden layers, and output layers. Twenty handwritten prescriptions of different classes are taken, including numeric characters, alphabetical characters, spaces, and punctuation. Users may photograph a prescription section by section to improve image legibility, which can confuse the model and reduce its accuracy. To overcome this issue, images are converted into small segments. For the analysis of images 64x64 pixels in size, 16 feature vectors are extracted from the feature map. The feature vectors are produced by the convolution layer and are extracted from each sliding window. The model is trained on 11,000 samples, with epochs running from 0 to 10,000. The learning rate is 0.0005 with a batch size of 10. For this experiment, 2000 images are used for validation.

Abbreviation                    Explanation
P_Rec                           Patient's Record
B_NT                            Blockchain Network
N_Admin                         Admin Node
P_N, H_N, L_N, I_N              Pharmacy, Hospital, Patho lab, Insurance firm nodes respectively
P_TIN, H_TIN, L_TIN, I_TIN      Pharmacy, Hospital, Patho lab, Insurance firm tax IDs respectively
Table 1: Abbreviations used in the algorithm and their explanations.

(a) Average Latency with Varying Transaction Rate. (b) Throughput with Varying Transaction Rate. (c) No. of Completed Transactions with Time. (d) Resource consumption.
Figure 8: Measurement of different performance parameters. The results are calculated over five rounds, with each round consisting of 1000 transactions at various transaction rates per second (tps).

Type     Name                      CPU(avg)   Memory(avg)   Traffic-In   Traffic-Out   Disc Write
Docker   peer0.org1.example.com    36.6       284.5MB       10.4MB       4.5MB         4.2MB
Docker   peer0.org2.example.com    28.4       280.0MB       10.5MB       5.6MB         4.2MB
Docker   peer0.org3.example.com    25.1       275.5MB       9.8MB        9.8MB         4.2MB
Docker   Orderer.example.com       2.34       50.0MB        2.5MB        1.2MB         1.2MB
Table 2: Resource consumption of various parameters.

(a) Sample handwritten Prescription. (b) Sample output.
Figure 9: Handwritten prescription and sample output.

         CNN-LSTM   ResNet-LSTM
Train    89.1       90.3
Test     81.6       88.3
Table 3: Model accuracy (%) for the handwritten data extraction approach.

According to the above results, the training accuracy improved after the 30th epoch. Finally, an accuracy of 88.3% was achieved on the test dataset with the ResNet-LSTM model, as given in Table 3. The generated model takes an input and generates a text output, which is pre-processed by string operations. After the string operations are completed, the produced output is sent to the network.

Figure 10: Training graph sample of ResNet-LSTM with output.

4.3 Printed prescription data extraction - training and validation

Tesseract-OCR is a pre-trained OCR engine, originally developed at Hewlett-Packard and later developed further under Google's sponsorship. For improved image processing, dilation and erosion are applied to the output of Otsu's thresholding. This approach also provides better accuracy, as mentioned in prior research on Tesseract [16]. In our case, the test data yields 99% accuracy. An example is shown in Figure 11.

(a) Printed Prescription. (b) Sample output.
Figure 11: Output of Tesseract OCR text extraction.
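The pre-processing described for printed prescriptions (Otsu's thresholding followed by erosion and dilation) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline, which presumably relies on an image-processing library before calling Tesseract. The function names, the 3x3 structuring element, the assumption of dark ink on light paper, and the erode-then-dilate order are all hypothetical choices made for illustration only.

```python
from typing import List

Image = List[List[int]]  # grayscale image: rows of intensities in 0..255


def otsu_threshold(image: Image) -> int:
    """Return the threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]          # pixels with intensity <= t
        sum_dark += t * hist[t]
        w_light = total - w_dark   # pixels with intensity > t
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image: Image, t: int) -> Image:
    """Mark ink as 1, assuming dark ink (<= t) on light paper."""
    return [[1 if v <= t else 0 for v in row] for row in image]


def _morph(binary: Image, pick) -> Image:
    """Apply min (erosion) or max (dilation) over a clipped 3x3 window."""
    h, w = len(binary), len(binary[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [binary[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out


def erode(binary: Image) -> Image:
    return _morph(binary, min)


def dilate(binary: Image) -> Image:
    return _morph(binary, max)


def clean_for_ocr(image: Image) -> Image:
    """Hypothetical helper: threshold, then remove specks by opening."""
    binary = binarize(image, otsu_threshold(image))
    return dilate(erode(binary))


if __name__ == "__main__":
    # A 3x3 ink stroke on light paper plus one isolated dark noise pixel.
    page = [[220] * 6 for _ in range(6)]
    for y in range(1, 4):
        for x in range(1, 4):
            page[y][x] = 30
    page[4][5] = 30
    for row in clean_for_ocr(page):
        print(row)
```

In this sketch, erosion removes isolated dark specks that survive thresholding, and the subsequent dilation restores the thickness of the remaining strokes, so the cleaned binary image is what would be passed on to the OCR engine.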
5 Conclusion and future work

EHR systems will be of considerable importance in advancing the digital medical space of developing regions such as the Indian subcontinent. With the advancement of Blockchain technology, its potential to significantly impact the future of EHR systems has been recognized, owing to the advantages of Blockchain-based systems over traditional systems and paper-based record keeping. Blockchain-based EHR systems improve security and efficiency and reduce cost, making them an excellent option for the Indian subcontinent. In this paper, we have highlighted some common issues that have arisen due to the lack of communication and tracking infrastructure, such as the influx of counterfeit drugs, dependence on handwritten prescriptions, and lack of integration between healthcare and insurance systems. We have discussed solutions to these issues using Blockchain, NLP, Hyperledger Fabric, and Docker containers, among others. The proposed scalable system will allow integration of existing EHR and healthcare databases, national identification, cloud technology to store large files with encryption, and a mobile app-based interface that converts paper prescriptions to text using OCR and deep learning techniques, thereby bringing old paper-based medical records into the new system. Our future research will address implementation challenges at the grassroots level and collect more training samples to increase the accuracy of converting handwritten prescriptions into text.

References

[1] R. Achkar, K. Ghayad, R. Haidar, S. Saleh, and R. Al Hajj. Medical handwritten prescription recognition using CRNN. In 2019 International Conference on Computer, Information and Telecommunication Systems (CITS), pages 1–5. IEEE, 2019. https://doi.org/10.1109/CITS.2019.8862004.
[2] A. Baker.
Crossing the quality chasm: a new health system for the 21st century, volume 323. British Medical Journal Publishing Group, 2001. https://doi.org/10.1136/bmj.323.7322.1192.
[3] P. K. Bharimalla, S. Praharaj, and S. R. Dash. ANN based block chain security threat mechanism. International Journal of Innovative Technology and Exploring Engineering (IJITEE), 8(10), 2019. https://doi.org/10.35940/ijitee.J9442.0881019.
[4] P. Bhattacharya, S. Tanwar, U. Bodke, S. Tyagi, and N. Kumar. Bindaas: Blockchain-based deep-learning as-a-service in healthcare 4.0 applications. IEEE Transactions on Network Science and Engineering, 2019. https://doi.org/10.1109/TNSE.2019.2961932.
[5] M. Bhavin, S. Tanwar, N. Sharma, S. Tyagi, and N. Kumar. Blockchain and quantum blind signature-based hybrid scheme for healthcare 5.0 applications. Journal of Information Security and Applications, 56:102673, 2021. https://doi.org/10.1016/j.jisa.2020.102673.
[6] L. Chen, W.-K. Lee, C.-C. Chang, K.-K. R. Choo, and N. Zhang. Blockchain based searchable encryption for electronic health record sharing. Future Generation Computer Systems, 95:420–429, 2019. https://doi.org/10.1016/j.future.2019.01.018.
[7] B. Devkota and A. Devkota. Electronic health records: advantages of use and barriers to adoption. Health Renaissance, 11(3):181–184, 2013. https://doi.org/10.3126/hren.v11i3.9629.
[8] A. Dubovitskaya, F. Baig, Z. Xu, R. Shukla, P. S. Zambani, A. Swaminathan, M. M. Jahangir, K. Chowdhry, R. Lachhani, and N. Idnani. Action-EHR: Patient-centric blockchain-based electronic health record data management for cancer care. Journal of Medical Internet Research, 22(8):e13598, 2020. https://www.jmir.org/2020/8/e13598/.
[9] J. Fu, N. Wang, and Y. Cai. Privacy-preserving in healthcare blockchain systems based on lightweight message sharing. Sensors, 20(7):1898, 2020. https://doi.org/10.3390/s20071898.
[10] G. Gavrilov, O. Simov, and V. Trajkovik.
Blockchain-based model for authentication, authorization, and immutability of healthcare data in the referrals process. 2020. http://hdl.handle.net/20.500.12188/8179.
[11] C. Kombe, A. Sam, M. Ally, and A. Finne. Blockchain technology in sub-Saharan Africa: Where does it fit in healthcare systems: A case of Tanzania. Journal of Health Informatics in Developing Countries, 13(2), 2019. https://www.jhidc.org/index.php/jhidc/article/view/24.
[12] D. S. Maitra, U. Bhattacharya, and S. K. Parui. CNN based common approach to handwritten character recognition of multiple scripts. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pages 1021–1025. IEEE, 2015. https://doi.org/10.1109/ICDAR.2015.7333916.
[13] B. Mounia and C. Habiba. Big data privacy in healthcare Moroccan context. Procedia Computer Science, 63:575–580, 2015. https://doi.org/10.1016/j.procs.2015.08.387.
[14] R. Shanthapriya and V. Vaithianathan. Block-healthnet: security based healthcare system using block-chain technology. Security Journal, pages 1–19, 2020. https://doi.org/10.1057/s41284-020-00265-z.
[15] A. P. Singh, N. R. Pradhan, S. Agnihotri, N. Jhanjhi, S. Verma, U. Ghosh, D. Roy, et al. A novel patient-centric architectural framework for blockchain-enabled healthcare applications. IEEE Transactions on Industrial Informatics, 2020. https://doi.org/10.1109/TII.2020.3037889.
[16] A. P. Tafti, A. Baghaie, M. Assefi, H. R. Arabnia, Z. Yu, and P. Peissig. OCR as a service: an experimental evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. In International Symposium on Visual Computing, pages 735–746. Springer, 2016. https://doi.org/10.1007/978-3-319-50835-1_66.
[17] S. Tanwar, K. Parekh, and R. Evans. Blockchain-based electronic healthcare record system for healthcare 4.0 applications. Journal of Information Security and Applications, 50:102407, 2020.
https://doi.org/10.1016/j.jisa.2019.102407.
[18] M. Thakkar and D. C. Davis. Risks, barriers, and benefits of EHR systems: a comparative study based on size of hospital. Perspectives in Health Information Management/AHIMA, American Health Information Management Association, 3, 2006. https://pubmed.ncbi.nlm.nih.gov/18066363/.
[19] D. Thompson, F. Velasco, D. Classen, and R. J. Raddemann. Reducing clinical costs with an EHR: investments in performance management are essential to realizing the full benefits of an EHR system–including reduced costs and improved quality of care. Healthcare Financial Management, 64(10):106–112, 2010. https://link.gale.com/apps/doc/A243277528/AONE?u=anon~b7bb66fd&sid=googleScholar&xid=4b596be9.
[20] V. Varadharajan, D. Bansal, S. J. Nair, et al. Blockchain reinventing the healthcare industry: Use cases and applications. In Industry Use Cases on Blockchain Technology Applications in IoT and the Financial Sector, pages 309–328. IGI Global, 2021. https://doi.org/10.4018/978-1-7998-6650-3.ch013.