Added Support for BytesIO as alternate to sending filename and added support to accept a previously processed image data to avoid reprocessing #187

Open
hemanshukale wants to merge 2 commits into
reingart:master from
hemanshukale:master

@hemanshukale

A BytesIO instance can now be passed instead of a filename.
Image data, once processed, can be stored and reused, even across separate instances of FPDF.

More details and code examples are in the subsequent comments.

…mp data is stored/processed. Check comments for detailed info
…y reprocessing of images. Hash of data dict can also be used to refer to a processed image. Check comments for detailed info
@hemanshukale
Author

BytesIO addition (bd19403)

I was working with images created within the script itself, so I needed to pass them to FPDF without storing them on the filesystem.
This commit adds BytesIO support so an image can be passed without saving it first.

Usage:

import io
from fpdf import FPDF

f = FPDF()
f.add_page()
buffer = io.BytesIO()
<PIL.Image instance>.save(buffer, format='PNG')  # or format='JPEG'
ioBuffer = io.BytesIO(buffer.getvalue())  # fresh buffer positioned at the start of the image bytes
f.image(ioBuffer, x, y, w, h, type="png", sub_type='tounique')  # the new sub_type makes sure this instance is stored

Reasoning for adding sub_type=tounique

FPDF.images is a dictionary that stores processed image data, using the name argument of FPDF.image() as the key.
If you pass a BytesIO instance as that argument, its address can be repeated by a later buffer, and the script will treat the new buffer as the same data sent again,
so it will not process the new data. Passing sub_type='tounique' appends the index number to the name to make a unique key.
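
The key collision described above can be illustrated with a minimal sketch. It does not use FPDF at all: `store_image`, the `#` separator, and the made-up buffer name are hypothetical stand-ins for a name-keyed cache, not the code in this PR.

```python
def store_image(images, name, payload, to_unique=False):
    """Store payload under name, skipping names that are already cached.

    Hypothetical helper: with to_unique=True an index is appended to the
    name so that a repeated name still produces a fresh key.
    """
    key = f"{name}#{len(images) + 1}" if to_unique else name
    if key not in images:
        images[key] = payload
    return key, images[key]


images = {}
# Made-up name standing in for a buffer whose address happens to repeat.
name = "<_io.BytesIO object at 0x7f00>"

_, first = store_image(images, name, b"image-A")
# Same name, different data: the cache returns the old entry.
_, second = store_image(images, name, b"image-B")
# With the unique suffix the new data gets its own entry.
_, third = store_image(images, name, b"image-B", to_unique=True)

print(second, third, len(images))
```

Without the suffix the second call silently returns the first image's data, which is the failure mode that 'tounique' is meant to avoid.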

@hemanshukale
Author

Reusing processed image data (a35c701)

In another use case, I needed to place the same image multiple times in multiple PDFs. If I pass the BytesIO instance or filename,
the processed data is reused within one FPDF instance and not reprocessed for that instance, but it is still processed once per new PDF.
To avoid this reprocessing, we can store the dictionary produced by processing the data once and pass it to FPDF.image() every other time.

Usage:

f = FPDF()
f.add_page()
processedDict = f._parsepng(ioBuffer)  # returns the dict of processed image data
processedDictHash = FPDF.s256(processedDict)  # hash of the processed dict data

first = True  # True until image() has been called once on this FPDF instance
for (condition):
    if first:
        # The first time, pass the whole processed dict
        f.image(processedDict, x, y, w, h, type='png', sub_type='tohash')
        first = False
    else:
        # Every other time, pass only the hash of the processed dict
        f.image(processedDictHash, x, y, w, h, type='png')

Reasoning for adding sub_type=tohash

However, this would mean passing the same dict (which can be several MB in size) on every call, and FPDF.image() would check each time whether it is already present in the images dictionary.
Hence, when the argument tohash is passed, the key used for storing the processed data is made from the hash of the processed dictionary. Every time we need the same image again (except the first time per FPDF instance), we can pass just the hash of the dict, and the script will compare only that.
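
As a rough illustration of the hash-keyed lookup described above, here is a minimal sketch that does not depend on FPDF. The implementation of FPDF.s256 is not shown in this thread, so `digest_of` is only an assumed equivalent built on `hashlib.sha256`, and `place_image` and the sample dict fields are hypothetical.

```python
import hashlib


def digest_of(info):
    """Return a SHA-256 hex digest over the sorted items of a dict."""
    h = hashlib.sha256()
    for key in sorted(info):
        h.update(repr((key, info[key])).encode("utf-8"))
    return h.hexdigest()


def place_image(images, source):
    """Accept a processed dict (first call) or its digest (later calls)."""
    if isinstance(source, dict):
        key = digest_of(source)
        images.setdefault(key, source)  # store the dict once under its hash
        return key
    if source not in images:
        raise KeyError("unknown digest; pass the processed dict first")
    return source  # only the short digest string is compared


# Hypothetical processed-image dict; real field names may differ.
processed = {"w": 2, "h": 2, "cs": "DeviceRGB", "data": b"\x00" * 12}
images = {}

key = place_image(images, processed)   # first call: whole dict
again = place_image(images, key)       # later calls: digest only
print(key == again, len(images))
```

Later calls compare a 64-character string rather than a multi-megabyte dict, which is the saving that 'tohash' is aiming for.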

If you run into unexplained issues, try passing a copy.deepcopy of processedDict or processedDictHash instead of the original variable.
