Search This Blog

Wednesday 20 April 2016

Does a commit in a called program commit transactions in the calling program or main procedure?

Yes, in the normal case it does. If we do not want this behaviour, we have to run the called (child) program's transactions as autonomous transactions.
An autonomous transaction is an independent transaction started by another transaction, the main transaction. Autonomous transactions can perform SQL operations and commit or roll back without committing or rolling back the main transaction.
To start an autonomous transaction, we declare the AUTONOMOUS_TRANSACTION pragma (a compiler directive) at the top of the declarative section; a minimal sketch follows the comparison below.

Autonomous Transactions Vs Nested transactions
By default, a nested/child procedure or function shares its transaction context with the main procedure or function. We can mark a child routine as autonomous using the AUTONOMOUS_TRANSACTION pragma; the autonomous transaction then runs in its own context.
Therefore, autonomous transactions
·       Do not share resources such as locks with the main transaction
·       Do not depend on the main transaction
·       Make their committed changes visible to other transactions immediately

·       Note that when an autonomous routine calls a non-autonomous routine, the callee becomes a nested transaction and therefore shares the same context as the parent autonomous transaction
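
A minimal sketch of the pragma in use, run through sqlplus from the shell. The connection variables (DB_USER, DB_PASS, DB_TNS) and the audit_log table are hypothetical placeholders, not part of this post.

# Sketch only: a local procedure declared as autonomous inside an anonymous block.
# Assumes sqlplus is on PATH and that a table audit_log(msg VARCHAR2(200)) exists.
sqlplus -s "$DB_USER/$DB_PASS@$DB_TNS" <<'EOF'
DECLARE
  PROCEDURE log_msg(p_msg VARCHAR2) IS
    PRAGMA AUTONOMOUS_TRANSACTION;   -- this child commits independently of the caller
  BEGIN
    INSERT INTO audit_log (msg) VALUES (p_msg);
    COMMIT;                          -- commits only the autonomous transaction
  END;
BEGIN
  log_msg('kept even if the main transaction rolls back');
  ROLLBACK;                          -- rolls back the main transaction only
END;
/
EOF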


Monday 11 April 2016

Create a virtual machine in Windows 8 and Windows 10, then install Linux Mint OS


Hyper-V (the Microsoft hypervisor) provides virtual machine management services and is part of the Windows 8 and Windows 10 operating systems, but it is not enabled/installed by default. To install or enable the virtual machine management services in Windows 8/10, follow the steps below.

1.     First, enable the virtualization technology (VT) feature at the processor level by accessing the BIOS
    • BIOS is the basic input/output system; its configuration controls the hardware features (including virtualization) available to the operating system.
    • To access the BIOS, restart the computer from "Settings → Change PC settings → Update and recovery → Recovery → Advanced startup (Restart now) → Troubleshoot → Advanced options → Startup settings", then keep Enter pressed while it restarts.
    • Once the BIOS is open, you can enable virtualization technology from the Configuration tab.
2.     Once VT is enabled in the BIOS, go to Add/Remove Programs in the Control Panel, click "Turn Windows features on or off", select all the Hyper-V options and click OK. This installs Hyper-V.
3.     From Hyper-V Manager, you can create a new virtual machine; select "Install an operating system later" while creating it.
4.     Download the Linux Mint ISO file from https://www.linuxmint.com/download.php
5.     Then start the virtual machine from Hyper-V Manager, choose the ISO file for the OS installation and follow the installation steps.
6.     openSUSE can be installed the same way after downloading its ISO file, but that installation needs a network connection from the virtual machine. To establish the network connection, create an external virtual switch using Virtual Switch Manager and attach this external switch while creating the virtual machine.

Now, to connect to this virtual machine's Linux box from a PuTTY session on the host (the commands are collected in a sketch after this list):
1.     Check the virtual machine's IP address:
ip addr show
→ under eth0 (the network interface), the inet address is the IP address we are looking for
2.     Using the above IP address we can connect via PuTTY with the SSH connection type. But before this we need to install SSH in the guest Linux OS (in the virtual machine). We can do this with the command "sudo apt-get install openssh-server".
3.     Once SSH is installed, we need to open port 22. This can be done by editing the config file ("sudo vi /etc/ssh/sshd_config") and uncommenting the Port 22 line, then stopping and starting the SSH service with "sudo /sbin/service sshd stop" and "sudo /sbin/service sshd start".
4.     Check whether SSH is enabled and running using the command "netstat -lnpt | grep 22".
5.     You can test whether SSH is working by running "ssh <vm ip>" in the virtual machine itself.
6.     If there are issues establishing the SSH connection from the host computer to the Hyper-V virtual machine, configure the virtual machine with a static IP. To do this, open the YaST Control Center and go to System → Network Settings; in the Overview tab edit the IP address to be static and assign an address, then give this same address as the default IPv4 gateway in the Routing tab. This stops the internet connection working in the virtual machine, but enables the SSH connection from the host to the virtual machine.

7.     Once successfully connected via SSH using that IP, revert the static IP to dynamic using the YaST Control Center. This time both the internet connection and the PuTTY SSH connection will work.
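
For reference, the guest-side commands from the steps above collected into one rough sketch. It assumes a Debian/Ubuntu-based guest such as Linux Mint; on openSUSE the package manager is zypper and the SSH service may be named ssh or sshd.

ip addr show                          # note the inet address under eth0
sudo apt-get install openssh-server   # install the SSH server in the guest
sudo vi /etc/ssh/sshd_config          # uncomment the Port 22 line
sudo /sbin/service sshd stop          # restart the SSH service
sudo /sbin/service sshd start
netstat -lnpt | grep 22               # confirm something is listening on port 22
ssh <vm ip>                           # quick loopback test from inside the VM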

Wednesday 6 April 2016

IBM Infosphere QualityStage simplified


IBM InfoSphere QualityStage is part of the Information Server suite and is available with an additional license. Buying QualityStage enables more stages (plus the match specification designer and rule designer, for use in these stages); that's it. Everything else is the same and is part of DataStage.
Do not confuse this with Information Analyzer: the Data Rules stage is not part of QualityStage, it is part of Information Analyzer.

What additional stages come with QualityStage?

1.      Investigate

2.      Standardize

3.      One-source Match

4.      Two-source Match

5.      Match Frequency

6.      Survive



 

If we understand the steps in the data quality assurance process, then understanding these stages is easy, because each QualityStage stage corresponds to a step in that process.


While working with any of the data quality stages, it is important to understand the difference between single-domain columns and free-form columns.

Single-domain columns represent one specific business attribute, for example first name or customer id.

Free-form columns contain free text, a combination of many attributes. For example, a name column could contain first and last name, or first, middle and last name; similarly, an address column could contain only the house number and street name, or the entire address along with the postcode.

Analyzing single-domain columns is easy because we know up front what data the field contains. Analyzing free-form fields is not that easy, because we first need to define a set of rules (using the rule designer) describing what the free-form field might contain. The rules work on the fact that a free-form field is just a bunch of tokens: we map each token to a pattern, then decide the output action when a token conforms to the pattern.
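
As a rough illustration of the tokenize-and-pattern idea (this is not QualityStage syntax; the real pattern notation and classification tables live in the rule sets):

# Toy sketch only: classify each token of a free-form value as numeric (N) or alphabetic (A)
# to build a crude pattern, the same idea a rule set applies before mapping patterns to actions.
echo "123 Main Street" | awk '{
  pattern = ""
  for (i = 1; i <= NF; i++)
    pattern = pattern ($i ~ /^[0-9]+$/ ? "N" : "A")
  print $0, "->", pattern      # prints: 123 Main Street -> NAA
}'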

To standardize single-domain columns, we may need a rule set in some cases, for example to look up a correct value, correct a spelling or change a description to an acronym; or we could simply use a Modify or Transformer stage to deal with nulls, change the format, and so on.

The QualityStage Standardize stage is normally used to standardize free-form columns, i.e. to split free-form columns into single-domain columns. We need rule sets for this purpose.

The following QualityStage stages use rule sets:

·        Investigate

·        Standardize

·        Multinational Standardize (MNS)

A rule set contains:

·        Classifications

·        Output Columns

·        Rules

o   Rules define the input patterns and the actions to take when a pattern matches. As part of this pattern matching we can use pattern specifications, classification tables and lookup tables

o   We can specify what action to take when a pattern matches and what to write to the output columns

 

When it comes to the matching process,

        matching is simply comparing two records and checking whether they are duplicates or not. Then we can use the Survive stage to keep one record and drop the other record of the duplicate pair.

Match stages take a match specification as input; a match specification is created using the match specification designer.

A match specification tells us:

·        On which keys we need to match

·        What the group-by (blocking) columns are, used to divide the total records into blocks so we operate on blocks first instead of the entire data set (see the toy sketch after this list). Dividing the data into blocks is very important, because we are not trying to identify only 100% matches. Two records that match by, say, 20% may be good enough to decide that they could be duplicates and need to be reviewed by someone before being confirmed; we call such matches the clerical match type. So, to work out which record matches which, we first divide the data into blocks where we are likely to find duplicates

·        How many passes we need to run the match process before finalizing the match weightage?
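
A toy sketch of the blocking idea (not QualityStage itself), assuming a hypothetical comma-separated people.csv whose third field is a postcode used as the block key:

# Sketch only: group records on a block key so candidate pairs are compared
# within a block rather than across the whole data set.
sort -t',' -k3,3 people.csv |
awk -F',' '
  $3 != prev { print "--- block:", $3; prev = $3 }   # a new block starts here
  { print "   candidate:", $0 }                      # records to compare pairwise within the block
'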




 

Tuesday 5 April 2016

IBM Infosphere datastage - Command Line interface commands for administration


Common administration tasks performed by a DataStage admin

1.      Managing assets and asset migration (for example, exporting security assets such as users and user groups)

a.      ISTOOL →

istool is a command line interface, present on both the client and engine tiers. The location of the CLI is /opt/IBM/InformationServer/Clients/istools/cli

istool is used for managing assets, i.e. building a package, sending or deploying a package, importing/exporting assets, etc.

Example:

istool export -security '-securityUser -userident "*" -includeRoles' -domain host:port -username user -password password -archive ExportISFSecurityRoles.isx

 

2.      Enable or disable product features or set/unset the product configuration properties

a.      IISADMIN →

The iisadmin command can be used to activate or deactivate an edition or feature pack.


 

For example, if we want to see which features are active, we can use the command below:

./iisAdmin.sh -display -key com.ibm.iis.datastage.license.*

Results..

com.ibm.iis.datastage.license.option.parallel=1

com.ibm.iis.datastage.license.option.qualitystage=0

 

The above results indicate that QualityStage is not enabled.

 

3.      Administering services tier i.e. starting or stopping services

a.      Starting services

                                                    i.     First start the application server

1.      Go to /opt/IBM/InformationServer/ASBServer/bin

2.      Run “./MetadataServer.sh start” or “./MetadataServer.sh run”

3.      "run" echoes the output (i.e. runs in the foreground) while "start" runs it in the background

                                                   ii.     Then start engine service

1.      Go to /opt/IBM/InformationServer/Server/DSEngine

2.      ./bin/uv -admin -start

                                                  iii.     Then start ASB Agent

1.      Go to /opt/IBM/InformationServer/ASBNode/bin

2.      ./NodeAgents.sh start

b.      Shutting down or stopping services

                                                    i.     First stop the datastage engine services (Metadata Server services, DSRPC Server etc)

1.      Go to /opt/IBM/InformationServer/Server/DSEngine

2.      ./bin/uv -admin -stop

                                                   ii.     Then stop agents

1.      Go to /opt/IBM/InformationServer/ASBNode/bin

2.      ./NodeAgents.sh stop

                                                  iii.     Then stop application server

1.      Go to /opt/IBM/InformationServer/ASBServer/bin

2.      ./MetadataServer.sh stop

 

4.      AppServerAdmin Command

a.      Run this command whenever the admin user account password is changed, so that the new password is reflected in the configuration of all Information Server suite components

5.      SessionAdmin command

a.      Use this to manage and monitor the active sessions

6.      DirectoryAdmin tool

a.      Use this to access the metadata repository and the user registry, and to complete a variety of actions on the user registry, including adding a new user, changing a password, deleting a user/group, changing a user role, etc.

7.      DirectoryCommand tool

a.      Similar to DirectoryAdmin, but not the same. This can be used to add/delete users/groups, etc.

8.      Encrypt

a.      Use this command to encrypt the user credentials

9.      Orchadmin

a.      Use this command to inspect and manage parallel datasets:
orchadmin check  -- check the configuration file

                         orchadmin copy, delete, describe and truncate  -- operate on datasets

                         orchadmin dump  -- dump dataset data to readable-format files

10.   What is ASB agent

a.      This is the background process that conveys requests from the client tier to the services tier.

11.   What is dsrpcd (DSRPC Service)

a.      This service allows the client tier to connect to the server engine.

12.   Another very important command is "dsjob". Although administrators do not normally use this command, it is very important to know and understand how dsjob works.
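
A few typical dsjob invocations as a sketch, following the same pattern as the engine commands above. The project and job names are placeholders, and it assumes the engine environment (dsenv) has been sourced first.

cd /opt/IBM/InformationServer/Server/DSEngine && . ./dsenv   # set up the engine environment
./bin/dsjob -ljobs myproject                                 # list the jobs in a project
./bin/dsjob -run -wait -jobstatus myproject myjob            # run a job and wait for its exit status
./bin/dsjob -logsum myproject myjob                          # summarize the job's log entries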

Monday 4 April 2016

Unix Interview questions

1.      How do we kill a process group and all children of that process group with a single command?
Ans: To kill a process group, supply the negated group number to kill, as in "kill -TERM -1234", where 1234 is the process group id (PGID). Plain "ps -ef" does not show the PGID; it shows the PPID, and the PPID cannot be used to kill a process group. To identify the PGID, use ps with the -o option, for example "ps -eo pid,ppid,pgid,cmd".
2.      Using a regular expression with grep, how do we retrieve all the lines which have A as the 6th character?
Ans: grep "^.\{5\}A" file_name
Explanation: . (dot) matches any single character and \{5\} repeats it exactly 5 times, so the first 5 characters can be anything and A must be the 6th character.
3.      How do we debug the unix script?
Ans: use -x while running the script with ksh, or put "set -x" in the script to print each command as it is executed
4.      "set -x" prints commands after variable substitution. How do you print or echo commands before variable substitution?
Ans: set -v
5.      If we want to refer to a variable or input parameter of the parent process in a child script, how do we call the child script?
Ans: A child script can be invoked from a Unix script as an internal process or an external process. If we want to refer to the parent's input parameters or variables in the child, we need to invoke the child as an internal process, i.e. source it (". script_full_path").
For a variable to be visible in an external child process it must be exported; an internal (sourced) script sees the parent's variables directly.
Input parameters of the parent process can be used in the internal child process the same way as in the parent (i.e. using $1, $2, etc.)
6.      How do we know which processes are currently running on the system?
Ans: using ps command.
7.      How could we print the 11th line of a file containing 20 records?
Ans: we could use either a HEAD and TAIL combination or SED
Using HEAD and TAIL:  head -11 file_name | tail -1
Using SED:  sed -n '11p' file_name
8.      How do we print the first 3 words of a record?
Ans: we could use awk or cut to print the words, treating each word as a field separated by a space.
Using cut:  cut -f1-3 -d' ' file_name
Using awk:  awk '{print $1, $2, $3}' file_name
9.      How do we erase all files in the current directory, including all its sub-directories, using only one command?
Ans: rm -r *
There is another command called "shred", used to overwrite a file a number of times (3 passes by default in current GNU coreutils; older versions defaulted to 25) to make it unrecoverable, which is not exactly the same as rm. Files removed using rm are also not recoverable through the filesystem, but the data remains on disk (rm only removes the inode link), so a sophisticated recovery tool could read it; therefore shred should be used for sensitive information.
10.   How do we find out history of all commands executed?
Ans : using history command
11.   Differentiate the cmp command from the diff command.
Ans: The cmp command is used to find out whether two files are the same (byte by byte) or not. The diff command indicates the changes that have to be made to make the two files identical.
12.   If I am the owner of a file and I want to change the file's owner, how do I do that?
Ans: Only the superuser (root) can execute the change-owner command; a file's owner cannot change the owner of the file.
To change the file owner or group, we can use the chown and chgrp commands.
13.   If I want to echo the data to standard output as well as write to a file in single command, how do I do that?
Ans: using tee command
14.   How do we mark the end of a command in a Unix script?
Ans: Normally we use the Enter (return) key to end a line, and this end-of-line character acts as the command terminator. But we can also use ; (semicolon) to mark the end of a command.
15.   What is an inode and how do we list them?
Ans: A Unix file is stored in two different parts of the disk: data blocks and inodes.
Data blocks contain the actual data.
Directories are tables that map file names to inodes.
An inode contains file metadata such as the owner, permissions, size, last accessed time, last updated time and the number of hard links.
To list inodes we can use the -i option of ls; for example "ls -i" lists the inode number of each file.
16.   What is the difference between who and finger?
Ans: who gives basic information about the users currently logged in. finger gives more detailed/personal information, including about users who are not logged in, and can also be used to get information about users on a remote host.
17.   What is PS1?
Ans: The command prompt string is stored in PS1. We can change the prompt by changing the value of the PS1 variable.
18.   What are the different unix standard streams?
Ans: stdin (file descriptor 0), stdout (file descriptor 1) and stderr (file descriptor 2)
echo "abc" > /dev/null 2>&1 prints nothing because
standard output (file descriptor 1) is redirected to /dev/null (the null device, a black hole), and standard error (file descriptor 2) is redirected to file descriptor 1, which is itself redirected to /dev/null
19.   Difference between command piping and grouping?
Ans: piping is a way of using the output of one command as the input of another command
Grouping is running multiple commands together and combining their output
Piping example:  ls -al | more    -- prints one page of the listing, pauses, and prints more on Enter
Grouping example: (date; cal; date) > out.txt          -- writes the date and calendar output together to the out.txt file
20.   What is xargs and how its used?
Ans:
If you combine two commands with | without using xargs, the first command's output is used as the input data of the second command, i.e. the second command operates on the output stream.
With xargs, the first command's output is used as command-line arguments to the second command, i.e. the second command operates using the first command's output as its arguments.
By default xargs runs the command /bin/echo; we can tell xargs to run the command we want.
For example:
find . -name "*.sh" | grep "CREATE TABLE"
                     This command lists all the .sh files and greps that list of names, so it only matches file names containing "CREATE TABLE".
find . -name "*.sh" | xargs grep "CREATE TABLE"
                     This command lists all the .sh files and runs grep on each of them, so it matches files whose contents contain the "CREATE TABLE" string.
21.   How do we run the jobs as background task ?
Ans: using &
If we want the job to keep running after logout and to capture its stdout (in nohup.out), we can use nohup.
A background job can be brought to the foreground using "fg %jobnum"; get jobnum using the "jobs" command.
Ctrl+Z can be used to interrupt a job and stop it temporarily; resume it with the fg or bg commands (these are part of the shell's job control functions).
22.   How do we schedule a job in unix?
Ans: using crontab or "at"
Edit the crontab (crontab -e), then add an entry
23.   Can I specify optional arguments for a script and if yes, how do we process optional arguments?
Ans: yes, we can, using getopts
The getopts utility can be used to retrieve options and option-arguments from a list of parameters
Optional input parameters should be prefixed with -.
For example, most Unix commands take optional parameters: for the rm command the positional parameter is the file name and the optional parameters are -f, -r, etc.
We use a while loop to process all the optional parameters, for example:
while getopts "ab:" name
do
      case $name in
                a) echo "option -a was given" ;;
                b) echo "option -b was given with value $OPTARG" ;;
                ?) echo "usage: $0 [-a] [-b value]" ;;
      esac
done
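
A hypothetical invocation, assuming the loop above is saved (with a shebang line) as opts.sh:

./opts.sh -a -b hello    # reports that -a was given and that -b received "hello"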