Thursday, June 4, 2009

(FAQ) Socket Programming Simple Client Program

Socket API Calls to Create Simple Client Program
(Source: Scott Klement's Socket Programming)

Topics:
1. The socket() API call
2. The connect() API call
3. The send() and recv() API calls
4. Translating from ASCII to EBCDIC
5. The close() API call
6. Our first client program


1. The socket() API call

The previous sections explained how to find out the port number for a service name, and how to get the IP address for a host name. This section will utilize that information to create a simple client program.

The socket() API is used to create a socket. You can think of a socket as the virtual device that is used to read data from, and write data to, a network connection.

The IBM manual page that documents the socket API is at this link: http://publib.boulder.ibm.com/pubs/html/as400/v4r5/ic2924/info/apis/socket.htm

It lists the following as the prototype for the socket() API:

int socket(int address_family,
           int type,
           int protocol)


This tells us that the name of the procedure to call is 'socket', and that it accepts 3 parameters. Each one is an integer, passed by value. It also returns an integer. Therefore, the RPG prototype for the socket() API looks like this:

     D socket          PR            10I 0 ExtProc('socket')
     D  addr_family                  10I 0 value
     D  type                         10I 0 value
     D  protocol                     10I 0 value


It's important to realize that the socket APIs can be used for other networking protocols besides TCP/IP. When we create a socket, we need to explain to the socket API that we wish to communicate using the IP protocol, and that we wish to use TCP on top of the IP protocol.

For address family, the manual tells us that we need to specify a value of 'AF_INET' if we wish to do network programming in the 'Internet domain'. Therefore, when we specify a value of 'AF_INET', what we're really telling the API is to 'use the IP protocol'.

Under the 'type' parameter, it allows us to give values of 'SOCK_DGRAM', 'SOCK_SEQPACKET', 'SOCK_STREAM' or 'SOCK_RAW'. The TCP protocol is the standard streaming protocol for use over IP. So, if we say 'SOCK_STREAM', we'll use the TCP protocol. As you might imagine, SOCK_DGRAM is used for the UDP protocol and SOCK_RAW is used for writing raw IP datagrams.

Finally, we specify which protocol we wish to use with our socket. Note that, again, we can specify IPPROTO_TCP for TCP, IPPROTO_UDP for UDP, etc. However, this isn't necessary! Because we already specified that we wanted a 'stream socket' over 'internet domain', it already knows that it should be using TCP. Therefore, we can specify 'IPPROTO_IP' if we want, and the API will use the default protocol for the socket type.

Now, we just have one problem: we don't know what integer values AF_INET, SOCK_STREAM and IPPROTO_IP are! IBM is referencing named constants that are defined in the appropriate header files for C programs, but we don't have these defined for us in RPG. If you do a bit of snooping in the 'System Openness Includes' library, though, you'll find that AF_INET is defined to be '2', SOCK_STREAM is defined to be '1', and IPPROTO_IP is defined as '0'. To make this easier for us, we'll make named constants that match these values, like so:

     D AF_INET         C                   CONST(2)
     D SOCK_STREAM     C                   CONST(1)
     D IPPROTO_IP      C                   CONST(0)


Now we can call the socket() API like so:

     c                   eval      s = socket(AF_INET:SOCK_STREAM:IPPROTO_IP)
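
For comparison, here is a minimal sketch of the same call in C, the language the API is documented in. It assumes a POSIX system where the constants come from the standard headers, and the function name open_tcp_socket is made up for this illustration. Note that it checks the return value, since socket() returns a negative number when it fails.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Create a stream socket in the Internet domain.
   Returns the socket descriptor, or -1 if socket() failed.
   (open_tcp_socket is a hypothetical helper name.) */
int open_tcp_socket(void)
{
    /* IPPROTO_IP (0) asks for the default protocol for this address
       family and socket type, which is TCP for AF_INET/SOCK_STREAM. */
    int s = socket(AF_INET, SOCK_STREAM, IPPROTO_IP);
    if (s < 0) {
        perror("socket");
    }
    return s;
}
```

The descriptor that comes back plays the same role as the variable 's' in the RPG call above.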


2. The connect() API call

Once we have a socket to work with, we need to connect it to something. We do that using the connect() API, which is documented in IBM's manual at this location: http://publib.boulder.ibm.com/pubs/html/as400/v4r5/ic2924/info/apis/connec.htm

It tells us here that the prototype for the connect() API looks like this:

int connect(int socket_descriptor,
            struct sockaddr *destination_address,
            int address_length)


So, as you can see, the procedure is named 'connect', and it accepts 3 parameters: an integer, a pointer to a 'sockaddr' structure, and another integer. It also returns an integer. This means that the RPG prototype will look like this:

     D connect         PR            10I 0 ExtProc('connect')
     D  sock_desc                    10I 0 value
     D  dest_addr                      *   value
     D  addr_len                     10I 0 value


Looking further down the manual, we see that a 'sockaddr' structure is defined as follows:

struct sockaddr {
    u_short sa_family;
    char    sa_data[14];
};


Remember, the purpose of this structure is to tell the API which IP address and port number to connect to. Why, then, doesn't it contain fields that we can put the address and port numbers into? Again, we have to remember that the socket APIs can work with many different network protocols. Each protocol has a completely different format for how addresses work. This 'sockaddr' structure is, therefore, a generic structure. It contains a place to put the identifying address family, along with a generic "data" field that the address can be placed in, regardless of the format of the address.

Although it's not documented on IBM's page for the connect() API, there is actually a different structure called 'sockaddr_in' which is designed especially for internet addresses. The C definition for sockaddr_in can be found in the file QSYSINC/NETINET, member IN, if you have the System Openness Includes loaded. It looks like this:

struct sockaddr_in {              /* socket address (internet)   */
    short          sin_family;    /* address family (AF_INET)    */
    u_short        sin_port;      /* port number                 */
    struct in_addr sin_addr;      /* IP address                  */
    char           sin_zero[8];   /* reserved - must be 0x00's   */
};


To make it easier to use these structures in RPG, I like to make them based in the same area of memory. This means that you can look at the data as a 'sockaddr', or the same data as a 'sockaddr_in' without moving the data around. Having said that, here's the definition that I use for the sockaddr & sockaddr_in structures:

     D p_sockaddr      S               *
     D sockaddr        DS                  based(p_sockaddr)
     D  sa_family                     5I 0
     D  sa_data                      14A
     D sockaddr_in     DS                  based(p_sockaddr)
     D  sin_family                    5I 0
     D  sin_port                      5U 0
     D  sin_addr                     10U 0
     D  sin_zero                      8A


Before we can call the connect() API, we need to ask the operating system for some memory that we can store our sockaddr structure into. Then, we can populate the sockaddr_in structure, and actually call the connect() API. Like so:

     DName+++++++++++ETDsFrom+++To/L+++IDc.Keywords+++++++++++++++++++
     D p_connto        S               *
     D addrlen         S             10I 0

     C* Ask the operating system for some memory to store our socket
     C* address into:
     C                   eval      addrlen = %size(sockaddr)
     C                   alloc     addrlen       p_connto

     C* Point the socket address structure at the newly allocated
     C* area of memory:
     C                   eval      p_sockaddr = p_connto

     C* Populate the sockaddr_in structure
     C* Note that IP is the ip address we previously looked up
     C* using the inet_addr and/or gethostbyname APIs
     C* and port is the port number that we looked up using the
     C* getservbyname API.
     C                   eval      sin_family = AF_INET
     C                   eval      sin_addr = IP
     C                   eval      sin_port = PORT
     C                   eval      sin_zero = *ALLx'00'

     C* Finally, we can connect to a server:
     C                   if        connect(s: p_connto: addrlen) < 0
     C*** Connect failed, report error here
     C                   endif
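
The same steps can be sketched in C, which makes the relationship between 'sockaddr_in' and the generic 'sockaddr' explicit: the address is filled in as a sockaddr_in and then passed to connect() through a cast. This is only an illustration under a few assumptions: a POSIX system, a dotted-decimal address string, and the helper names fill_inet_addr and connect_inet, which are invented here. One difference from the RPG code is the call to htons(). The AS/400 already stores integers in network (big-endian) byte order, so the RPG code can assign the port directly, but portable C code has to convert it.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Fill in an Internet socket address from a dotted-decimal string and a
   port number in host byte order.
   Returns 0 on success, -1 if the address string is malformed.
   (fill_inet_addr is a hypothetical helper name.) */
int fill_inet_addr(struct sockaddr_in *sa, const char *dotted,
                   unsigned short port)
{
    memset(sa, 0, sizeof *sa);          /* this also zeroes sin_zero */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);         /* convert to network byte order */
    if (inet_pton(AF_INET, dotted, &sa->sin_addr) != 1) {
        return -1;
    }
    return 0;
}

/* Connect socket s to the address; returns whatever connect() returns.
   (connect_inet is a hypothetical helper name.) */
int connect_inet(int s, const struct sockaddr_in *sa)
{
    /* The cast views the same bytes as a generic 'sockaddr', much like
       the two based data structures in the RPG definition. */
    return connect(s, (const struct sockaddr *)sa, sizeof *sa);
}
```

Here sizeof *sa takes the place of the %size(sockaddr) value that the RPG code stores in addrlen.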


3. The send() and recv() API calls

Once we've made a connection, we'll want to use that connection to send and receive data across the network. We'll do that using the send() and recv() APIs.

IBM's manual page for the send() API can be found at this link: http://publib.boulder.ibm.com/pubs/html/as400/v4r5/ic2924/info/apis/send.htm

It tells us that the prototype for the send() API looks like this:

int send(int socket_descriptor,
         char *buffer,
         int buffer_length,
         int flags)


Yes, the procedure is called 'send', and it accepts 4 parameters. Those parameters are an integer, a pointer, an integer and another integer. The send() API also returns an integer. Therefore, the RPG prototype for this API is:

     D send            PR            10I 0 ExtProc('send')
     D  sock_desc                    10I 0 value
     D  buffer                         *   value
     D  buffer_len                   10I 0 value
     D  flags                        10I 0 value


You may have noticed that for other 'char *' definitions, we put the 'options(*string)' keyword in our D-specs, but we didn't this time. Why? Because the send() API doesn't use a trailing null-character to determine the end of the data to send. Instead, it uses the buffer_length parameter to determine how much data to send.

That is a useful feature to us, because it means that we are able to transmit the null-character over the network connection as well as the rest of the string, if we so desire.

The flags parameter is used for 'out of band data', and for sending 'non-routed' data. You'll almost never use these flags. Why? Because 'out of band data' has never been widely adopted. Many TCP/IP stacks don't even implement it properly. In fact, for a long time, sending 'out-of-band' data to a Windows machine caused it to crash. The popular program called 'winnuke' does nothing more than send some out-of-band data to a Windows machine. The other flag, 'don't route', is really only used when writing routing applications. In all other situations, you want your packets to be routed! Therefore, it's very rare for us to specify anything but a 0 in the flags parameter.

The return value of the send() API will be the number of bytes sent, or a negative number if an error occurred.

Consequently, we typically call the send() API like this:

     D miscdata        S             25A
     D rc              S             10I 0

     C                   eval      miscdata = 'The data to send goes here'
     C                   eval      rc = send(s: %addr(miscdata): 25: 0)
     c                   if        rc < 25
     C*  for some reason we weren't able to send all 25 bytes!
     C                   endif
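
The example above treats a short count as a failure. Another common way to handle it is to keep calling send() with whatever is left until everything has been written. The following C sketch shows that idea; it is an illustration rather than part of the tutorial's program, and the name send_all is made up.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Call send() repeatedly until all len bytes of buf have been written.
   Returns 0 on success, -1 if send() reports an error.
   (send_all is a hypothetical helper name.) */
int send_all(int s, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        /* Flags are 0, as in the RPG example. */
        ssize_t rc = send(s, buf + sent, len - sent, 0);
        if (rc <= 0) {
            return -1;
        }
        sent += (size_t)rc;     /* skip past the bytes already written */
    }
    return 0;
}
```

A caller would use it as send_all(s, miscdata, 25) and only needs to check for a negative result.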


The recv() API is used to receive data over the network. IBM has documented this API here: http://publib.boulder.ibm.com/pubs/html/as400/v4r5/ic2924/info/apis/recv.htm

recv() is very similar to send(). In fact, the prototype for recv() is nearly identical to the one for send(); the only difference is the name of the procedure that you call. The prototype looks like this:

int recv(int socket_descriptor,
         char *buffer,
         int buffer_length,
         int flags)


And, just like send, the RPG prototype looks like this:

     D recv            PR            10I 0 ExtProc('recv')
     D  sock_desc                    10I 0 value
     D  buffer                         *   value
     D  buffer_len                   10I 0 value
     D  flags                        10I 0 value


The obvious difference between send() and recv() is what the system does with the memory pointed to by the 'buffer' parameter. When using send(), the data in the buffer is written out to the network. When using recv(), data is read from the network and is written to the buffer.

Another, less obvious, difference is how much data gets processed on each call to these APIs. By default, when you call the send() API, the API call won't return control to your program until the entire buffer has been written out to the network. By contrast, the recv() API will receive all of the data that's currently waiting for your application.

By default, recv() will always wait for at least one byte to be received. But, if there are more bytes, it will return them all, up to the length of the buffer that you've requested.

In the send() example above, 25 bytes are always written to the network unless an error has occurred. In the recv() example below, we can receive anywhere from 1 to 25 bytes of data. We have to check the return code of the recv() API to see how much we actually received.

Here's a quick example of calling recv():

     D miscdata        S             25A
     D rc              S             10I 0

     C                   eval      rc = recv(s: %addr(miscdata): 25: 0)
     c                   if        rc < 1
     C*  Something is wrong, we didn't receive anything.
     C                   endif
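
Because a single recv() may hand back anywhere from 1 byte to the full buffer, a program that needs a fixed number of bytes has to call recv() in a loop. The C sketch below shows one way to do that. It is not taken from the tutorial, and the name recv_exact is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling recv() until len bytes have arrived, the peer closes the
   connection, or an error occurs.
   Returns the number of bytes stored in buf, or -1 on error.
   (recv_exact is a hypothetical helper name.) */
ssize_t recv_exact(int s, char *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        ssize_t rc = recv(s, buf + got, len - got, 0);
        if (rc < 0) {
            return -1;          /* error */
        }
        if (rc == 0) {
            break;              /* connection closed by the other side */
        }
        got += (size_t)rc;      /* a partial read: ask for the rest */
    }
    return (ssize_t)got;
}
```

A return value smaller than len then means the connection ended before all of the expected data arrived.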


4. Translating from ASCII to EBCDIC

Almost all network communications use the ASCII character set, but the AS/400 natively uses the EBCDIC character set. Clearly, once we're sending and receiving data over the network, we'll need to be able to translate between the two.

There are many different ways to translate between ASCII and EBCDIC. The API that we'll use to do this is called QDCXLATE, and you can find it in IBM's information center at the following link: http://publib.boulder.ibm.com/pubs/html/as400/v4r5/ic2924/info/apis/QDCXLATE.htm

There are other APIs that can be used to do these conversions. In particular, the iconv() set of APIs does a really good job. However, QDCXLATE is the easiest to use, and will work just fine for our purposes.

The QDCXLATE API takes the following parameters:
Parm#  Description                 Usage   Data Type
1      Length of data to convert   Input   Packed (5,0)
2      Data to convert             I/O     Char (*)
3      Conversion table            Input   Char (10)

And, since QDCXLATE is an OPM API, we actually call it as a program. Traditionally, you'd call an OPM API with the RPG 'CALL' statement, like this:

     C                   CALL      'QDCXLATE'
     C                   PARM      128           LENGTH            5 0
     C                   PARM                    DATA            128
     C                   PARM      'QTCPEBC'     TABLE            10


However, I find it easier to code program calls using prototypes, just as I use for procedure calls. So, when I call QDCXLATE, I will use the following syntax:

     D Translate       PR                  ExtPgm('QDCXLATE')
     D  Length                        5P 0 const
     D  Data                      32766A   options(*varsize)
     D  Table                        10A   const

     C                   callp     Translate(128: Data: 'QTCPEBC')


There are certain advantages to using the prototyped call. The first, and most obvious, is that each time we want to call the program, we can do it in one line of code. The next is that the 'const' keyword allows the compiler to automatically convert expressions or numeric variables to the data type required by the call. Finally, the prototype allows the compiler to do more thorough syntax checking when calling the program.

There are two tables that we will use in our examples, QTCPASC and QTCPEBC. These tables are easy to remember if we just keep in mind that the table name specifies the character set that we want to translate the data into. In other words, 'QTCPEBC' is the IBM-supplied table for translating TCP data to EBCDIC (from ASCII), and 'QTCPASC' is the IBM-supplied table for translating TCP data to ASCII (from EBCDIC).
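
To make the idea of a translation table concrete, here is a small C sketch of table-driven, in-place conversion, which is conceptually what a call to QDCXLATE does. It is an illustration only and makes several assumptions: it is compiled on an ASCII machine, the tables cover just letters, digits and a handful of characters that a simple HTTP request needs, and the EBCDIC code points are the common ones (for example 'A' is x'C1' and a blank is x'40'). The real QTCPASC and QTCPEBC tables cover the full character set, and all of the names below are made up.

```c
#include <stddef.h>
#include <string.h>

/* Partial 256-byte tables, indexed by the byte to be translated. */
static unsigned char to_ascii[256];
static unsigned char to_ebcdic[256];

/* Record one ASCII/EBCDIC pair in both tables. */
static void map_pair(unsigned char ascii, unsigned char ebcdic)
{
    to_ebcdic[ascii] = ebcdic;
    to_ascii[ebcdic] = ascii;
}

/* Map a run of consecutive ASCII characters onto consecutive EBCDIC
   code points, e.g. 'A'..'I' onto x'C1'..x'C9'. */
static void map_run(char first, char last, unsigned char ebcdic_start)
{
    for (int c = first; c <= last; c++) {
        map_pair((unsigned char)c,
                 (unsigned char)(ebcdic_start + (c - first)));
    }
}

/* Build the partial tables; unmapped bytes translate to '?'. */
void build_tables(void)
{
    memset(to_ascii, '?', sizeof to_ascii);
    memset(to_ebcdic, 0x6F, sizeof to_ebcdic);      /* EBCDIC '?' */
    map_run('A', 'I', 0xC1);
    map_run('J', 'R', 0xD1);
    map_run('S', 'Z', 0xE2);
    map_run('a', 'i', 0x81);
    map_run('j', 'r', 0x91);
    map_run('s', 'z', 0xA2);
    map_run('0', '9', 0xF0);
    map_pair(' ', 0x40);
    map_pair('.', 0x4B);
    map_pair('/', 0x61);
    map_pair('\r', 0x0D);                           /* CR */
    map_pair('\n', 0x25);                           /* LF */
}

/* Translate len bytes of data in place through a 256-byte table. */
void translate(unsigned char *data, size_t len, const unsigned char *table)
{
    for (size_t i = 0; i < len; i++) {
        data[i] = table[data[i]];
    }
}
```

As with the IBM tables, the table name says which character set the data ends up in: translate(data, len, to_ebcdic) converts ASCII data to EBCDIC, and translate(data, len, to_ascii) converts it back.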


5. The close() API call

This section documents what is, by far, the easiest socket API to call: the close() API. This API is used to disconnect a connected socket and destroy the socket descriptor. (In other words, to use the socket again after calling close(), you have to call socket() again.)

Here's the IBM manual page that describes the close() API: http://publib.boulder.ibm.com/pubs/html/as400/v4r5/ic2924/info/apis/close.htm

The manual tells us that the prototype for close() looks like this:

int close(int fildes)


So, the procedure's name is 'close' and it accepts one parameter, an integer. It also returns an integer. The RPG prototype looks like this:

     D close           PR            10I 0 ExtProc('close')
     D  sock_desc                    10I 0 value


To call it, we can simply do:

     C                   eval      rc = close(s)
     C                   if        rc < 0
     C*** Socket didn't close.  Now what?
     c                   endif


Or, more commonly (because there isn't much we can do if close() fails) we do something like this:

     C                   callp     close(s)


Too easy to be a UNIX-Type API, right? Well, never fear, there's one complication. The system uses the same close() API for closing sockets that it uses for closing files in the integrated file system.

This means that if you use both sockets and IFS reads/writes in your program, you only need to define one prototype for close(). Handy, right? Unfortunately, most people put all of the definitions needed for socket APIs into one source member that they can /COPY into all of the programs that need it. Likewise, the IFS prototypes and other definitions are put into their own /COPY member.

When you try to use both /COPY members in the same program, you end up with a duplicate definition of the close() API, causing the compiler to become unhappy.

The solution is relatively simple... When we make a header file for either sockets or IFS, we use the RPG /define and /if defined directives to force it to only include the close() prototype once. So, our prototype will usually look like this:

     D/if not defined(CLOSE_PROTOTYPE)
     D close           PR            10I 0 ExtProc('close')
     D  sock_desc                    10I 0 value
     D/define CLOSE_PROTOTYPE
     D/endif


6. Our first client program

We've learned a lot of new API calls over the past few sections. It's time to put these new APIs to use with an example program.

This program is a very simple http client. It connects to a web server on the internet, and requests that a web page (or another file) on the server be sent back to it. It then receives the data that the web server returns and displays it on the screen.

It's important to understand that most data that is sent or received over the internet uses the concept of 'lines of text.' A line of text is a variable-length string of bytes. You can tell the end of a line by looking for the 'carriage-return' and 'line-feed' characters. When these characters appear in the text, it means that it's time to start a new line.

In ASCII, the 'carriage-return' character (CR) is x'0D', and the 'line-feed' character is x'0A'. When translated to EBCDIC, these are x'0D' and x'25', respectively.

Therefore, the pseudocode for this client program looks like this:

1. Look up the port number for the HTTP service and store it into the variable 'port'.
2. Look up the IP address for the hostname that was passed as a parameter, and store it in the variable 'IP'.
3. Call the socket() API to create a socket that we can use for communicating with the HTTP server.
4. Create a socket address structure ('sockaddr') that will tell the connect() API which host & service to connect to.
5. Call the connect() API to connect to the HTTP server.
6. Place a two-line 'request' into a variable. The first line will contain the phrase "GET /pathname/filename HTTP/1.0" which tells the HTTP server that we wish to get a file from it, and also tells the HTTP server where that file is. The "HTTP/1.0" means that we're using version 1.0 of the HTTP specifications (more about that later). The second line of the request is blank; that's how we tell the server that we're done sending requests.
7. Translate our request to ASCII so the server will understand it.
8. Call the send() API to send our request to the server.
9. Call the recv() API to read back 1 byte of the server's reply.
10. If an error occurred (that is, the server disconnected us) then we're done receiving, jump ahead to step 13.
11. If the byte that we've read is not the 'end-of-line' character, and our receive buffer isn't full, then add the byte to the end of the receive buffer, and go to step 9.
12. Translate the receive buffer to EBCDIC so we can read it, and display the receive buffer. Then go back to step 9 to get the next line of data.
13. Close the connection.
14. Pause the screen so the user can see what we received before the program ends.
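
Steps 6 and 9 through 11 above can be sketched in C as two small functions: one that formats the two-line request, and one that reads a single line of the reply a byte at a time. This is a rough illustration under stated assumptions, not the tutorial's program: it runs on an ASCII machine, so steps 7 and 12 (translation) are not needed, and the names build_request and recv_line are invented for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Step 6: format "GET <path> HTTP/1.0" followed by a blank line.
   Returns the length of the request, or -1 if it does not fit in buf.
   (build_request is a hypothetical helper name.) */
int build_request(char *buf, size_t cap, const char *path)
{
    int n = snprintf(buf, cap, "GET %s HTTP/1.0\r\n\r\n", path);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Steps 9-11: read one line, one byte at a time. CR is dropped and LF
   ends the line. Returns the line length (0 for an empty line), or -1
   on error or once the server has disconnected with nothing left to
   read. cap must be at least 1.
   (recv_line is a hypothetical helper name.) */
int recv_line(int s, char *buf, size_t cap)
{
    size_t len = 0;
    char ch;

    while (len + 1 < cap) {             /* leave room for the '\0' */
        ssize_t rc = recv(s, &ch, 1, 0);
        if (rc < 1) {
            if (len == 0) {
                return -1;              /* nothing more to read */
            }
            break;                      /* return the partial last line */
        }
        if (ch == '\n') {
            break;                      /* end of line */
        }
        if (ch != '\r') {
            buf[len++] = ch;
        }
    }
    buf[len] = '\0';
    return (int)len;
}
```

A client would send the request, then call recv_line() in a loop and display each line until it returns -1, which corresponds to jumping ahead to step 13.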

Without further ado, here's the sample program, utilizing all of the concepts from the past few sections of this tutorial:

File: SOCKTUT/QRPGLESRC, Member: CLIENTEX1

     H DFTACTGRP(*NO) ACTGRP(*NEW)

     D getservbyname   PR              *   ExtProc('getservbyname')
     D  service_name                   *   value options(*string)
     D  protocol_name                  *   value options(*string)

     D p_servent       S               *
     D servent         DS                  based(p_servent)
     D  s_name                         *
     D  s_aliases                      *
     D  s_port                       10I 0
     D  s_proto                        *

     D inet_addr       PR            10U 0 ExtProc('inet_addr')
     D  address_str                    *   value options(*string)

     D INADDR_NONE     C                   CONST(4294967295)

     D inet_ntoa       PR              *   ExtProc('inet_ntoa')
     D  internet_addr                10U 0 value

     D p_hostent       S               *
     D hostent         DS                  Based(p_hostent)
     D  h_name                         *
     D  h_aliases                      *
     D  h_addrtype                   10I 0
     D  h_length                     10I 0
     D  h_addr_list                    *
     D p_h_addr        S               *   Based(h_addr_list)
     D h_addr          S             10U 0 Based(p_h_addr)

     D gethostbyname   PR              *   extproc('gethostbyname')
     D  host_name                      *   value options(*string)

     D socket          PR            10I 0 ExtProc('socket')
     D  addr_family                  10I 0 value
     D  type                         10I 0 value
     D  protocol                     10I 0 value

     D AF_INET         C                   CONST(2)
     D SOCK_STREAM     C                   CONST(1)
     D IPPROTO_IP      C                   CONST(0)

     D connect         PR            10I 0 ExtProc('connect')
     D  sock_desc                    10I 0 value
     D  dest_addr                      *   value
     D  addr_len                     10I 0 value

     D p_sockaddr      S               *
     D sockaddr        DS                  based(p_sockaddr)
     D  sa_family                     5I 0
     D  sa_data                      14A
     D sockaddr_in     DS                  based(p_sockaddr)
     D  sin_family                    5I 0
     D  sin_port                      5U 0
     D  sin_addr                     10U 0
     D  sin_zero                      8A

     D send            PR            10I 0 ExtProc('send')
     D  sock_desc                    10I 0 value
     D  buffer                         *   value
     D  buffer_len                   10I 0 value
     D  flags                        10I 0 value

     D recv            PR            10I 0 ExtProc('recv')
     D  sock_desc                    10I 0 value
     D  buffer                         *   value
     D  buffer_len                   10I 0 value
     D  flags                        10I 0 value

     D close           PR            10I 0 ExtProc('close')
     D  sock_desc                    10I 0 value

     D translate       PR                  ExtPgm('QDCXLATE')
     D  length                        5P 0 const
     D  data                      32766A   options(*varsize)
     D  table                        10A   const

     D msg             S             50A
     D sock            S             10I 0
     D port            S              5U 0
     D addrlen         S             10I 0
     D ch              S              1A
     D host            s             32A
     D file            s             32A
     D IP              s             10U 0
     D p_Connto        S               *
     D RC              S             10I 0
     D Request         S             60A
     D ReqLen          S             10I 0
     D RecBuf          S             50A
     D RecLen          S             10I 0

     C*************************************************
     C* The user will supply a hostname and file
     C* name as parameters to our program...
     C*************************************************
     c     *entry        plist
     c                   parm                    host
     c                   parm                    file

     c                   eval      *inlr = *on

     C*************************************************
     C* what port is the http service located on?
     C*************************************************
     c                   eval      p_servent = getservbyname('http':'tcp')
     c                   if        p_servent = *NULL
     c                   eval      msg = 'Can''t find the http service!'
     c                   dsply                   msg
     c                   return
     c                   endif

     c                   eval      port = s_port

     C*************************************************
     C* Get the 32-bit network IP address for the host
     C* that was supplied by the user:
     C*************************************************
     c                   eval      IP = inet_addr(%trim(host))
     c                   if        IP = INADDR_NONE
     c                   eval      p_hostent = gethostbyname(%trim(host))
     c                   if        p_hostent = *NULL
     c                   eval      msg = 'Unable to find that host!'
     c                   dsply                   msg
     c                   return
     c                   endif
     c                   eval      IP = h_addr
     c                   endif

     C*************************************************
     C* Create a socket
     C*************************************************
     c                   eval      sock = socket(AF_INET: SOCK_STREAM:
     c                                           IPPROTO_IP)
     c                   if        sock < 0
     c                   eval      msg = 'Error calling socket()!'
     c                   dsply                   msg
     c                   return
     c                   endif

     C*************************************************
     C* Create a socket address structure that
     C* describes the host & port we wanted to
     C* connect to
     C*************************************************
     c                   eval      addrlen = %size(sockaddr)
     c                   alloc     addrlen       p_connto

     c                   eval      p_sockaddr = p_connto
     c                   eval      sin_family = AF_INET
     c                   eval      sin_addr = IP
     c                   eval      sin_port = port
     c                   eval      sin_zero = *ALLx'00'

     C*************************************************
     C* Connect to the requested host
     C*************************************************
     C                   if        connect(sock: p_connto: addrlen) < 0
     c                   eval      msg = 'unable to connect to server!'
     c                   dsply                   msg
     c                   callp     close(sock)
     c                   return
     c                   endif

     C*************************************************
     C* Format a request for the file that we'd like
     C* the http server to send us:
     C*************************************************
     c                   eval      request = 'GET ' + %trim(file) +
     c                                       ' HTTP/1.0' + x'0D25' + x'0D25'
     c                   eval      reqlen = %len(%trim(request))
     c                   callp     Translate(reqlen: request: 'QTCPASC')

     C*************************************************
     c* Send the request to the http server
     C*************************************************
     c                   eval      rc = send(sock: %addr(request): reqlen:0)
     c                   if        rc < reqlen
     c                   eval      Msg = 'Unable to send entire request!'
     c                   dsply                   msg
     c                   callp     close(sock)
     c                   return
     c                   endif

     C*************************************************
     C* Get back the server's response
     C*************************************************
     c                   dou       rc < 1
     C                   exsr      DsplyLine
     c                   enddo

     C*************************************************
     C* We're done, so close the socket.
     C* do a dsply with input to pause the display
     C* and then end the program
     C*************************************************
     c                   callp     close(sock)
     c                   dsply                   pause             1
     c                   return

     C*===============================================================
     C* This subroutine receives one line of text from a server and
     C* displays it on the screen using the DSPLY op-code
     C*===============================================================
     CSR   DsplyLine     begsr
     C*------------------------
     C*************************************************
     C* Receive one line of text from the HTTP server.
     C* note that "lines of text" vary in length,
     C* but always end with the ASCII values for CR
     C* and LF. CR = x'0D' and LF = x'0A'
     C*
     C* The easiest way for us to work with this data
     C* is to receive it one byte at a time until we
     C* get the LF character. Each time we receive
     C* a byte, we add it to our receive buffer.
     C*************************************************
     c                   eval      reclen = 0
     c                   eval      recbuf = *blanks

     c                   dou       reclen = 50 or ch = x'0A'
     c                   eval      rc = recv(sock: %addr(ch): 1: 0)
     c                   if        rc < 1
     c                   leave
     c                   endif
     c                   if        ch<>x'0D' and ch<>x'0A'
     c                   eval      reclen = reclen + 1
     c                   eval      %subst(recbuf:reclen:1) = ch
     c                   endif
     c                   enddo

     C*************************************************
     C* translate the line of text into EBCDIC
     C* (to make it readable) and display it
     C*************************************************
     c                   if        reclen > 0
     c                   callp     Translate(reclen: recbuf: 'QTCPEBC')
     c                   endif
     c     recbuf        dsply
     C*------------------------
     Csr                 endsr


Compile this program with: CRTBNDRPG PGM(CLIENTEX1) SRCFILE(xxx/QRPGLESRC) DBGVIEW(*LIST)

Run the program by typing: CALL CLIENTEX1 PARM('ods.ods.net' '/index.html')

(You should be able to use this to retrieve just about any web page)

