Extracting and parsing older style bitmap images from Lotus Notes documents


 Product:

 General Information > Bitmap Image Extraction > Version All

 Platform(s):

 All

 Edition(s):

 All

 Doc Number:

 1000092
Published 01-Oct-2016

 General Information

This support note contains guidance for extracting older bitmap images stored in $File fields in Lotus Notes documents. A sample agent has also been attached for parsing the binary data for the bitmap image.

 Solution

In older Lotus Notes documents some bitmap images are not stored directly in rich text fields but in $File items. The rich text field where the image is displayed contains a pointer to the image file (stored in the $File item) which is loaded and displayed in place on-the-fly.

The AGECOM Export Utility for Lotus Notes can extract these images for you. Alternatively, you can process them yourself using the information provided below.

Retrieve DXL Code for the Notes Document
The first step in extracting the image is to retrieve the DXL code for the Notes document. It is beyond the scope of this article to guide you on how to do this.

Locate the bitmap pointer item
Once you have the DXL code, look for content similar to the following:

<picture height='56px' width='1134px'>
<notesbitmap>
xP8gAAEAAQAAAAgAAAAAAAAAAAAAAAAAU1RHMTU2ODKVACYAAAAAAAAAAAAAAAAAAAAAAAAAbgQ4
AAAAAAAAAAAAAAAAAA==
</notesbitmap>
</picture>

The Base64 encoded data is actually a CDSTORAGELINK object. If you decode it, the raw data for the item will look similar to the following (non-printable bytes are shown here as dots):
Äÿ .....................STG15682•.&...................n.8.............

We can see from the decoded data that the $File item containing the bitmap image is called 'STG15682'.

If you convert the decoded data to a byte array you can obtain the following important object properties:
  • Byte 0 - this should equal 196 which indicates this is a CDSTORAGELINK object
  • Bytes 10, 11 - this gives us the length of the data following the object. In the above example the length is 8
  • Byte 24+ - this is the actual data. The number of bytes to read was previously obtained from bytes 10-11. In the above example we would retrieve a value of 'STG15682'
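
The byte layout described above can be sketched outside of Notes in a few lines of Python. This is a minimal illustration rather than the method used by the attached agent: the function name storage_link_name is hypothetical, the length in bytes 10-11 is assumed to be a little-endian WORD (which agrees with the sample, where the two bytes are 08 00), and the name is assumed to be plain ASCII.

```python
import base64
import struct

SIG_CD_STORAGELINK = 196  # expected value of byte 0, per the list above


def storage_link_name(notesbitmap_b64: str) -> str:
    """Return the $FILE name referenced by a Base64-encoded <notesbitmap> value."""
    # By default b64decode discards characters outside the Base64 alphabet,
    # so the line breaks found in the DXL are harmless.
    raw = base64.b64decode(notesbitmap_b64)
    if len(raw) < 24 or raw[0] != SIG_CD_STORAGELINK:
        raise ValueError("data is not a CDSTORAGELINK object")
    # Bytes 10-11: length of the data (assumed little-endian WORD).
    (name_length,) = struct.unpack_from("<H", raw, 10)
    # Byte 24 onwards: the data itself.
    name = raw[24:24 + name_length]
    if len(name) != name_length:
        raise ValueError("CDSTORAGELINK data is truncated")
    # Assumption: the name is plain ASCII, as in the 'STG15682' example.
    return name.decode("ascii")
```

Passing the content of the <notesbitmap> element from the example above to this function would be expected to return 'STG15682'.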

Locate the $File item containing the bitmap image
You should now get a handle to all the $FILE items in the DXL code and iterate over them. For each item, retrieve the value of the 'name' attribute of its <file> element and compare it with the name decoded in the previous step. Once you have a match, you know you've got the right item.
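
As a rough illustration of this lookup, the following Python sketch searches exported DXL text for the matching <file> element. The helper names local_name and find_file_element are hypothetical. Exported DXL normally declares an XML namespace, so the sketch compares local tag names only, and it assumes the item and file elements are laid out as in the $FILE sample in the next step.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def local_name(tag: str) -> str:
    """Strip a namespace prefix such as '{http://www.lotus.com/dxl}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def find_file_element(dxl_text: str, storage_name: str) -> Optional[ET.Element]:
    """Return the <file> element of the $FILE item named storage_name, or None."""
    root = ET.fromstring(dxl_text)
    for item in root.iter():
        if local_name(item.tag) != "item":
            continue
        # Only $FILE items are of interest; compare case-insensitively to be safe.
        if item.get("name", "").upper() != "$FILE":
            continue
        for child in item.iter():
            if local_name(child.tag) == "file" and child.get("name") == storage_name:
                return child
    return None
```

A return value of None would mean that no $FILE item with that name exists in the document.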

Extract the Filedata for the item
Once you've retrieved the correct $FILE item you need to retrieve the data for the image. A typical $FILE item containing a bitmap image looks like:

<item name='$FILE' summary='true' sign='true' seal='true'>
<object>
<file hosttype='cdstorage' compression='none' flags='storedindoc' encoding='none' name='STG15682' size='15102'>
<filedata>
AQD6OgEAmQAcAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAJUAJgAAAAAAAAAAAAAAAQAAAAAAAABu
BDgACAABAAgADgAGACQAlgCgBAAAAAAAAAAAAAAEAI4EAgUEwgDCAcQAQAIAyADCAcIAwgHDAMIB
wgDCAcIAwgFAAgHFAMIBwgDCAcoAwgHDAMIBwwDCAcYAwwHDAMIBwgDCAcQAwgHGAMMBQAICxQDD
</filedata>
</file>
</object>
</item>


The value of the 'filedata' element should be retrieved. The filedata is actually standard Lotus Notes compound data (i.e. a series of CD records) which has been encoded in Base64 format.

Write the filedata to a file as binary data
The filedata is Base64 encoded so it will need to be decoded and then written to a file as binary data.

In the following sample code the encoded data from the $FILE item has been put into a variable called FileDataContent. The code demonstrates a simple way of writing the content out to a binary file:

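Dim Session As New NotesSession
Dim Db As NotesDatabase
Dim Doc As NotesDocument
Dim Stream As NotesStream
Dim FileStream As NotesStream
Dim Body As NotesMIMEEntity

' Stop Notes converting MIME content to rich text while we work with it
Session.ConvertMime = False
Set Db = Session.CurrentDatabase

' Load the Base64 text into a MIME entity on a temporary (unsaved) document
Set Stream = Session.CreateStream
Call Stream.WriteText(FileDataContent)
Set Doc = Db.CreateDocument
Set Body = Doc.CreateMIMEEntity
Call Body.SetContentFromText(Stream, "", ENC_BASE64)

' Open the target file and discard anything already in it
Set FileStream = Session.CreateStream()
Call FileStream.Open("c:\temp\STG15682.cd")
Call FileStream.Truncate

' True = decode the Base64 content while writing the bytes to the file
Call Body.GetContentAsBytes(FileStream, True)

Call Stream.Close()
Call FileStream.Close()

Session.ConvertMime = True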
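
If the DXL is being processed outside of Notes, the same decode-and-write step could be sketched in Python as follows. The function name write_filedata is hypothetical. The optional size check assumes the 'size' attribute of the <file> element equals the decoded byte count, which holds for the sample above (15102 bytes) but is not confirmed for every document.

```python
import base64
from pathlib import Path
from typing import Optional


def write_filedata(file_data_content: str, target: Path,
                   expected_size: Optional[int] = None) -> int:
    """Decode Base64 <filedata> text, write it to target, and return the byte count."""
    # Line breaks inside the Base64 text are discarded by b64decode.
    raw = base64.b64decode(file_data_content)
    # Assumption: the 'size' attribute of <file> is the decoded length in bytes.
    if expected_size is not None and len(raw) != expected_size:
        raise ValueError("decoded %d bytes, expected %d" % (len(raw), expected_size))
    target.write_bytes(raw)
    return len(raw)
```

For the sample item this might be called as write_filedata(FileDataContent, Path(r"c:\temp\STG15682.cd"), expected_size=15102), where FileDataContent holds the text of the <filedata> element.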

Parse the bitmap binary file
Now that we have the data for the bitmap written out to a binary file we can parse it using an agent.

This support note has an attachment containing the code which can be used to parse the bitmap binary file. Create an agent in your Lotus Notes client and copy / paste the code in the attachment into your agent. Once you've created the agent run it and select the binary file.

ParseBitmapFile-Agent.txt

After the agent has completed, a text file will be available in the same folder as the file just processed. This text file is named after the original file, with '-debug.txt' appended (example: STG15682-debug.txt).

If you open the text file you'll be able to see the structure of the binary file in an easy to read format.

The typical structure of the bitmap file is:
1. WORD - Number of Blocks
2. One or more WORDs for each block with the length of each block
3. WORD - TYPE_COMPOSITE Flag

Following the header are the CD Records for the bitmap. Typically the CD records are:
- Graphic
- Bitmap Header
- One or more Bitmap Segments
- Bitmap Color Table
- Bitmap Pattern Table

If the image is a jpeg it may have an ImageHeader and one or more ImageSegments.
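
To give a feel for how the CD records follow one another, here is a minimal Python sketch that walks the records in a block. It is not the attached agent's code. It assumes the usual CD record signature layout: the low byte of the signature WORD identifies the record, and the high byte says where the length is stored (00 means a DWORD follows, FF means a WORD follows, anything else is the length itself). The signature numbers in KNOWN_SIGNATURES are inferred from the samples in this note, and the even-boundary padding is likewise an assumption.

```python
import struct
from typing import Iterator, Tuple

# Signature numbers inferred from the samples in this note (assumption, not a full list).
KNOWN_SIGNATURES = {149: "Bitmap Header", 153: "Graphic", 196: "Storage Link"}


def walk_cd_records(data: bytes, start: int = 0) -> Iterator[Tuple[int, int, int]]:
    """Yield (zero-based offset, signature, record length) for each CD record."""
    offset = start
    while offset + 2 <= len(data):
        signature, size_marker = data[offset], data[offset + 1]
        if size_marker == 0x00:
            header_size, fmt = 6, "<I"  # length is a DWORD after the signature WORD
        elif size_marker == 0xFF:
            header_size, fmt = 4, "<H"  # length is a WORD after the signature WORD
        else:
            header_size, fmt = 2, None  # the high byte is the length itself
        if offset + header_size > len(data):
            break  # truncated record header
        if fmt is None:
            length = size_marker
        else:
            (length,) = struct.unpack_from(fmt, data, offset + 2)
        if length < header_size:
            break  # malformed record; stop rather than loop forever
        yield offset, signature, length
        # Assumption: records start on even byte boundaries, so odd lengths are padded.
        offset += length + (length & 1)
```

Each tuple can be turned into a readable line by looking the signature up with KNOWN_SIGNATURES.get(signature, "Unknown"). For the sample file, starting at offset 6, the first record reported would be a Graphic of length 28.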

Looking at the text file you'll see the first few lines look something like:
Number of blocks: 1
Length of block 1: 15098
TYPE_COMPOSITE flag found at byte offset: 5 (04)

Graphic (ByteOffset: 7 (06), Length: 28)

We can see the first CD Record is a Graphic which starts at the 7th byte (hex offset 06) of the file and is 28 bytes long. We can also see the TYPE_COMPOSITE flag is located at the 5th byte (hex offset 04).
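
The header values reported above can be reproduced with a short Python sketch. The function name parse_bitmap_header is hypothetical, and the TYPE_COMPOSITE flag is assumed to be the little-endian WORD 0x0001, which is what the sample file contains at hex offset 04.

```python
import struct
from typing import List, Tuple

TYPE_COMPOSITE = 1  # assumption: WORD value 0x0001, as seen in the sample file


def parse_bitmap_header(data: bytes) -> Tuple[List[int], int]:
    """Return (block lengths, zero-based offset of the first TYPE_COMPOSITE flag)."""
    # WORD 0: number of blocks.
    (block_count,) = struct.unpack_from("<H", data, 0)
    # One WORD per block holding that block's length.
    lengths = list(struct.unpack_from("<%dH" % block_count, data, 2))
    # The flag follows directly after the length table.
    flag_offset = 2 + 2 * block_count
    (flag,) = struct.unpack_from("<H", data, flag_offset)
    if flag != TYPE_COMPOSITE:
        raise ValueError("TYPE_COMPOSITE flag not found at offset %d" % flag_offset)
    return lengths, flag_offset
```

The first twelve bytes of the sample filedata decode to 01 00 FA 3A 01 00 99 00 1C 00 00 00, for which this sketch returns one block of length 15098 and a flag offset of 4. The debug file reports this as the 5th byte because it counts positions from 1, while the hex value in parentheses counts from 0.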

Remove bytes before the TYPE_COMPOSITE flag
From the previous step we know the TYPE_COMPOSITE flag starts at the 5th byte, so the 4 bytes before it (the number of blocks and the length of the single block) are header data. Using a hex editor, remove the first 4 bytes of the file and then save it. Bitmap files that contain multiple blocks may also contain a TYPE_COMPOSITE flag (2 bytes) at the beginning of each block, and these flags should also be removed. Note: the very first TYPE_COMPOSITE flag must be left in the file and should occupy the first 2 bytes of data.
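
For anyone who would rather script this step than use a hex editor, the edit could be sketched in Python as shown below. The function name strip_block_header is hypothetical. The sketch assumes the blocks are stored back to back after the length table, that each block length includes its own TYPE_COMPOSITE flag (true for the single-block sample, where 4 + 15098 = 15102 bytes), and that the flag is the byte pair 01 00.

```python
import struct

TYPE_COMPOSITE_BYTES = b"\x01\x00"  # assumption: the flag as stored in the file


def strip_block_header(data: bytes) -> bytes:
    """Drop the block count and length table, and the flag of every block but the first."""
    (block_count,) = struct.unpack_from("<H", data, 0)
    lengths = struct.unpack_from("<%dH" % block_count, data, 2)
    offset = 2 + 2 * block_count  # first byte after the length table
    result = bytearray()
    for index, length in enumerate(lengths):
        block = data[offset:offset + length]
        if len(block) != length:
            raise ValueError("block %d is truncated" % (index + 1))
        offset += length
        if index == 0:
            # The very first flag must stay and occupy the first 2 bytes.
            if block[:2] != TYPE_COMPOSITE_BYTES:
                raise ValueError("first block does not start with TYPE_COMPOSITE")
        elif block[:2] == TYPE_COMPOSITE_BYTES:
            # Later blocks: remove the repeated flag if it is present.
            block = block[2:]
        result += block
    return bytes(result)
```

It is worth comparing the result with a manually edited copy for a file or two before relying on the script, particularly for multi-block images.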

Import the bitmap image back into Notes
Now that some of the header information has been removed we can import the image into a rich text field. The AppendRTFile method of NotesRichTextItem can do this easily for us.

Here's some sample code showing the image being appended to the Body field of the currently selected document:

Dim Session As New NotesSession
Dim Doc As NotesDocument
Dim RTItem As NotesRichTextItem

Set Doc = Session.DocumentContext

' Reuse the Body field if it exists, otherwise create it
If Doc.HasItem("Body") Then
    Set RTItem = Doc.GetFirstItem("Body")
Else
    Set RTItem = New NotesRichTextItem(Doc, "Body")
End If

' Append the edited CD file to the rich text field and save the document
Call RTItem.AppendRTFile("c:\temp\STG15682.cd")
Call RTItem.Update
Call Doc.Save(True, True)

That's it.
The bitmap image should now be stored in the document as a standard inline image.

This whole process can be automated and if you're proficient with Lotus Notes DXL processing then you could go a step further to remove the $FILE item containing the bitmap and replace the <notesbitmap> item with the actual image.

To easily extract these images and other Lotus Notes content please check out the AGECOM Export Utility for Lotus Notes.



  Related Attachments


© 2016 AGE Computer Consultancy. All rights reserved.
Material may not be reproduced or distributed in any form without permission.