Atoms Crowd  7.0.0
stb_image.h
1 /* stb_image - v2.19 - public domain image loader - http://nothings.org/stb
2 no warranty implied; use at your own risk
3 
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7 
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14 
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17 
18 
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22 
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8/16-bit-per-channel
25 
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29 
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34 
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
37 
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41 
42 Full documentation under "DOCUMENTATION" below.
43 
44 
45 LICENSE
46 
47 See end of file for license information.
48 
49 RECENT REVISION HISTORY:
50 
51 2.19 (2018-02-11) fix warning
52 2.18 (2018-01-30) fix warnings
53 2.17 (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
54 2.16 (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
55 2.15 (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
56 2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
57 2.13 (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
58 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
59 2.11 (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
60 RGB-format JPEG; remove white matting in PSD;
61 allocate large structures on the stack;
62 correct channel count for PNG & BMP
63 2.10 (2016-01-22) avoid warning introduced in 2.09
64 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
65 
66 See end of file for full revision history.
67 
68 
69 ============================ Contributors =========================
70 
71 Image formats Extensions, features
72 Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info)
73 Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info)
74 Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG)
75 Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks)
76 Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG)
77 Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip)
78 Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD)
79 github:urraka (animated gif) Junggon Kim (PNM comments)
80 Christopher Forseth (animated gif) Daniel Gibson (16-bit TGA)
81 socks-the-fox (16-bit PNG)
82 Jeremy Sawicki (handle all ImageNet JPGs)
83 Optimizations & bugfixes Mikhail Morozov (1-bit BMP)
84 Fabian "ryg" Giesen Anael Seghezzi (is-16-bit query)
85 Arseny Kapoulkine
86 John-Mark Allen
87 
88 Bug & warning fixes
89 Marc LeBlanc David Woo Guillaume George Martins Mozeiko
90 Christpher Lloyd Jerry Jansson Joseph Thomson Phil Jordan
91 Dave Moore Roy Eltham Hayaki Saito Nathan Reed
92 Won Chun Luke Graham Johan Duparc Nick Verigakis
93 the Horde3D community Thomas Ruf Ronny Chevalier github:rlyeh
94 Janez Zemva John Bartholomew Michal Cichon github:romigrou
95 Jonathan Blow Ken Hamada Tero Hanninen github:svdijk
96 Laurent Gomila Cort Stratton Sergio Gonzalez github:snagar
97 Aruelien Pocheville Thibault Reuille Cass Everitt github:Zelex
98 Ryamond Barbiero Paul Du Bois Engin Manap github:grim210
99 Aldo Culquicondor Philipp Wiesemann Dale Weiler github:sammyhw
100 Oriol Ferrer Mesia Josh Tobin Matthew Gregan github:phprus
101 Julian Raschke Gregory Mullen Baldur Karlsson github:poppolopoppo
102 Christian Floisand Kevin Schmidt github:darealshinji
103 Blazej Dariusz Roszkowski github:Michaelangel007
104 */
105 
106 #ifndef STBI_INCLUDE_STB_IMAGE_H
107 #define STBI_INCLUDE_STB_IMAGE_H
108 
109 // DOCUMENTATION
110 //
111 // Limitations:
112 // - no 12-bit-per-channel JPEG
113 // - no JPEGs with arithmetic coding
114 // - GIF always returns *comp=4
115 //
116 // Basic usage (see HDR discussion below for HDR usage):
117 // int x,y,n;
118 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
119 // // ... process data if not NULL ...
120 // // ... x = width, y = height, n = # 8-bit components per pixel ...
121 // // ... replace '0' with '1'..'4' to force that many components per pixel
122 // // ... but 'n' will always be the number that it would have been if you said 0
123 // stbi_image_free(data)
124 //
125 // Standard parameters:
126 // int *x -- outputs image width in pixels
127 // int *y -- outputs image height in pixels
128 // int *channels_in_file -- outputs # of image components in image file
129 // int desired_channels -- if non-zero, # of image components requested in result
130 //
131 // The return value from an image loader is an 'unsigned char *' which points
132 // to the pixel data, or NULL on an allocation failure or if the image is
133 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
134 // with each pixel consisting of N interleaved 8-bit components; the first
135 // pixel pointed to is top-left-most in the image. There is no padding between
136 // image scanlines or between pixels, regardless of format. The number of
137 // components N is 'desired_channels' if desired_channels is non-zero, or
138 // *channels_in_file otherwise. If desired_channels is non-zero,
139 // *channels_in_file has the number of components that _would_ have been
140 // output otherwise. E.g. if you set desired_channels to 4, you will always
141 // get RGBA output, but you can check *channels_in_file to see if it's trivially
142 // opaque because e.g. there were only 3 channels in the source image.
143 //
144 // An output image with N components has the following components interleaved
145 // in this order in each pixel:
146 //
147 // N=#comp components
148 // 1 grey
149 // 2 grey, alpha
150 // 3 red, green, blue
151 // 4 red, green, blue, alpha
152 //
153 // If image loading fails for any reason, the return value will be NULL,
154 // and *x, *y, *channels_in_file will be unchanged. The function
155 // stbi_failure_reason() can be queried for an extremely brief, end-user
156 // unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
157 // to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
158 // more user-friendly ones.
159 //
160 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
161 //
162 // ===========================================================================
163 //
164 // Philosophy
165 //
166 // stb libraries are designed with the following priorities:
167 //
168 // 1. easy to use
169 // 2. easy to maintain
170 // 3. good performance
171 //
172 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
173 // and for best performance I may provide less-easy-to-use APIs that give higher
174 // performance, in addition to the easy to use ones. Nevertheless, it's important
175 // to keep in mind that from the standpoint of you, a client of this library,
176 // all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
177 //
178 // Some secondary priorities arise directly from the first two, some of which
179 // make more explicit reasons why performance can't be emphasized.
180 //
181 // - Portable ("ease of use")
182 // - Small source code footprint ("easy to maintain")
183 // - No dependencies ("ease of use")
184 //
185 // ===========================================================================
186 //
187 // I/O callbacks
188 //
189 // I/O callbacks allow you to read from arbitrary sources, like packaged
190 // files or some other source. Data read from callbacks are processed
191 // through a small internal buffer (currently 128 bytes) to try to reduce
192 // overhead.
193 //
194 // The three functions you must define are "read" (reads some bytes of data),
195 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
196 //
197 // ===========================================================================
198 //
199 // SIMD support
200 //
201 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
202 // supported by the compiler. For ARM Neon support, you must explicitly
203 // request it.
204 //
205 // (The old do-it-yourself SIMD API is no longer supported in the current
206 // code.)
207 //
208 // On x86, SSE2 will automatically be used when available based on a run-time
209 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
210 // the typical path is to have separate builds for NEON and non-NEON devices
211 // (at least this is true for iOS and Android). Therefore, the NEON support is
212 // toggled by a build flag: define STBI_NEON to get NEON loops.
213 //
214 // If for some reason you do not want to use any of SIMD code, or if
215 // you have issues compiling it, you can disable it entirely by
216 // defining STBI_NO_SIMD.
217 //
218 // ===========================================================================
219 //
220 // HDR image support (disable by defining STBI_NO_HDR)
221 //
222 // stb_image now supports loading HDR images in general, and currently
223 // the Radiance .HDR file format, although the support is provided
224 // generically. You can still load any file through the existing interface;
225 // if you attempt to load an HDR file, it will be automatically remapped to
226 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
227 // both of these constants can be reconfigured through this interface:
228 //
229 // stbi_hdr_to_ldr_gamma(2.2f);
230 // stbi_hdr_to_ldr_scale(1.0f);
231 //
232 // (note, do not use _inverse_ constants; stbi_image will invert them
233 // appropriately).
234 //
235 // Additionally, there is a new, parallel interface for loading files as
236 // (linear) floats to preserve the full dynamic range:
237 //
238 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
239 //
240 // If you load LDR images through this interface, those images will
241 // be promoted to floating point values, run through the inverse of
242 // constants corresponding to the above:
243 //
244 // stbi_ldr_to_hdr_scale(1.0f);
245 // stbi_ldr_to_hdr_gamma(2.2f);
246 //
247 // Finally, given a filename (or an open file or memory block--see header
248 // file for details) containing image data, you can query for the "most
249 // appropriate" interface to use (that is, whether the image is HDR or
250 // not), using:
251 //
252 // stbi_is_hdr(char *filename);
253 //
254 // ===========================================================================
255 //
256 // iPhone PNG support:
257 //
258 // By default we convert iphone-formatted PNGs back to RGB, even though
259 // they are internally encoded differently. You can disable this conversion
260 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
261 // you will always just get the native iphone "format" through (which
262 // is BGR stored in RGB).
263 //
264 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
265 // pixel to remove any premultiplied alpha *only* if the image file explicitly
266 // says there's premultiplied data (currently only happens in iPhone images,
267 // and only if iPhone convert-to-rgb processing is on).
268 //
269 // ===========================================================================
270 //
271 // ADDITIONAL CONFIGURATION
272 //
273 // - You can suppress implementation of any of the decoders to reduce
274 // your code footprint by #defining one or more of the following
275 // symbols before creating the implementation.
276 //
277 // STBI_NO_JPEG
278 // STBI_NO_PNG
279 // STBI_NO_BMP
280 // STBI_NO_PSD
281 // STBI_NO_TGA
282 // STBI_NO_GIF
283 // STBI_NO_HDR
284 // STBI_NO_PIC
285 // STBI_NO_PNM (.ppm and .pgm)
286 //
287 // - You can request *only* certain decoders and suppress all other ones
288 // (this will be more forward-compatible, as addition of new decoders
289 // doesn't require you to disable them explicitly):
290 //
291 // STBI_ONLY_JPEG
292 // STBI_ONLY_PNG
293 // STBI_ONLY_BMP
294 // STBI_ONLY_PSD
295 // STBI_ONLY_TGA
296 // STBI_ONLY_GIF
297 // STBI_ONLY_HDR
298 // STBI_ONLY_PIC
299 // STBI_ONLY_PNM (.ppm and .pgm)
300 //
301 // - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
302 // want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
303 //
304 
305 
306 #ifndef STBI_NO_STDIO
307 #include <stdio.h>
308 #endif // STBI_NO_STDIO
309 
310 #define STBI_VERSION 1
311 
312 enum
313 {
314  STBI_default = 0, // only used for desired_channels
315 
316  STBI_grey = 1,
317  STBI_grey_alpha = 2,
318  STBI_rgb = 3,
319  STBI_rgb_alpha = 4
320 };
321 
322 typedef unsigned char stbi_uc;
323 typedef unsigned short stbi_us;
324 
325 #ifdef __cplusplus
326 extern "C" {
327 #endif
328 
329 #ifdef STB_IMAGE_STATIC
330 #define STBIDEF static
331 #else
332 #define STBIDEF extern
333 #endif
334 
336  //
337  // PRIMARY API - works on images of any type
338  //
339 
340  //
341  // load image by filename, open file, or memory buffer
342  //
343 
344  typedef struct
345  {
346  int(*read) (void* user, char* data, int size); // fill 'data' with 'size' bytes. return number of bytes actually read
347  void(*skip) (void* user, int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
348  int(*eof) (void* user); // returns nonzero if we are at end of file/data
350 
352  //
353  // 8-bits-per-channel interface
354  //
355 
356  STBIDEF stbi_uc* stbi_load_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* channels_in_file, int desired_channels);
357  STBIDEF stbi_uc* stbi_load_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* channels_in_file, int desired_channels);
358 #ifndef STBI_NO_GIF
359  STBIDEF stbi_uc* stbi_load_gif_from_memory(stbi_uc const* buffer, int len, int** delays, int* x, int* y, int* z, int* comp, int req_comp);
360 #endif
361 
362 
363 #ifndef STBI_NO_STDIO
364  STBIDEF stbi_uc* stbi_load(char const* filename, int* x, int* y, int* channels_in_file, int desired_channels);
365  STBIDEF stbi_uc* stbi_load_from_file(FILE* f, int* x, int* y, int* channels_in_file, int desired_channels);
366  // for stbi_load_from_file, file pointer is left pointing immediately after image
367 #endif
368 
370  //
371  // 16-bits-per-channel interface
372  //
373 
374  STBIDEF stbi_us* stbi_load_16_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* channels_in_file, int desired_channels);
375  STBIDEF stbi_us* stbi_load_16_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* channels_in_file, int desired_channels);
376 
377 #ifndef STBI_NO_STDIO
378  STBIDEF stbi_us* stbi_load_16(char const* filename, int* x, int* y, int* channels_in_file, int desired_channels);
379  STBIDEF stbi_us* stbi_load_from_file_16(FILE* f, int* x, int* y, int* channels_in_file, int desired_channels);
380 #endif
381 
383  //
384  // float-per-channel interface
385  //
386 #ifndef STBI_NO_LINEAR
387  STBIDEF float* stbi_loadf_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* channels_in_file, int desired_channels);
388  STBIDEF float* stbi_loadf_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* channels_in_file, int desired_channels);
389 
390 #ifndef STBI_NO_STDIO
391  STBIDEF float* stbi_loadf(char const* filename, int* x, int* y, int* channels_in_file, int desired_channels);
392  STBIDEF float* stbi_loadf_from_file(FILE* f, int* x, int* y, int* channels_in_file, int desired_channels);
393 #endif
394 #endif
395 
396 #ifndef STBI_NO_HDR
397  STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
398  STBIDEF void stbi_hdr_to_ldr_scale(float scale);
399 #endif // STBI_NO_HDR
400 
401 #ifndef STBI_NO_LINEAR
402  STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
403  STBIDEF void stbi_ldr_to_hdr_scale(float scale);
404 #endif // STBI_NO_LINEAR
405 
406  // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
407  STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const* clbk, void* user);
408  STBIDEF int stbi_is_hdr_from_memory(stbi_uc const* buffer, int len);
409 #ifndef STBI_NO_STDIO
410  STBIDEF int stbi_is_hdr(char const* filename);
411  STBIDEF int stbi_is_hdr_from_file(FILE* f);
412 #endif // STBI_NO_STDIO
413 
414 
415  // get a VERY brief reason for failure
416  // NOT THREADSAFE
417  STBIDEF const char* stbi_failure_reason(void);
418 
419  // free the loaded image -- this is just free()
420  STBIDEF void stbi_image_free(void* retval_from_stbi_load);
421 
422  // get image dimensions & components without fully decoding
423  STBIDEF int stbi_info_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* comp);
424  STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* comp);
425  STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const* buffer, int len);
426  STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const* clbk, void* user);
427 
428 #ifndef STBI_NO_STDIO
429  STBIDEF int stbi_info(char const* filename, int* x, int* y, int* comp);
430  STBIDEF int stbi_info_from_file(FILE* f, int* x, int* y, int* comp);
431  STBIDEF int stbi_is_16_bit(char const* filename);
432  STBIDEF int stbi_is_16_bit_from_file(FILE* f);
433 #endif
434 
435 
436 
437  // for image formats that explicitly notate that they have premultiplied alpha,
438  // we just return the colors as stored in the file. set this flag to force
439  // unpremultiplication. results are undefined if the unpremultiply overflow.
440  STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
441 
442  // indicate whether we should process iphone images back to canonical format,
443  // or just pass them through "as-is"
444  STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
445 
446  // flip the image vertically, so the first pixel in the output array is the bottom left
447  STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
448 
449  // ZLIB client - used by PNG, available for other purposes
450 
451  STBIDEF char* stbi_zlib_decode_malloc_guesssize(const char* buffer, int len, int initial_size, int* outlen);
452  STBIDEF char* stbi_zlib_decode_malloc_guesssize_headerflag(const char* buffer, int len, int initial_size, int* outlen, int parse_header);
453  STBIDEF char* stbi_zlib_decode_malloc(const char* buffer, int len, int* outlen);
454  STBIDEF int stbi_zlib_decode_buffer(char* obuffer, int olen, const char* ibuffer, int ilen);
455 
456  STBIDEF char* stbi_zlib_decode_noheader_malloc(const char* buffer, int len, int* outlen);
457  STBIDEF int stbi_zlib_decode_noheader_buffer(char* obuffer, int olen, const char* ibuffer, int ilen);
458 
459 
460 #ifdef __cplusplus
461 }
462 #endif
463 
464 //
465 //
467 #endif // STBI_INCLUDE_STB_IMAGE_H
468 
469 #ifdef STB_IMAGE_IMPLEMENTATION
470 
471 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
472  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
473  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
474  || defined(STBI_ONLY_ZLIB)
475 #ifndef STBI_ONLY_JPEG
476 #define STBI_NO_JPEG
477 #endif
478 #ifndef STBI_ONLY_PNG
479 #define STBI_NO_PNG
480 #endif
481 #ifndef STBI_ONLY_BMP
482 #define STBI_NO_BMP
483 #endif
484 #ifndef STBI_ONLY_PSD
485 #define STBI_NO_PSD
486 #endif
487 #ifndef STBI_ONLY_TGA
488 #define STBI_NO_TGA
489 #endif
490 #ifndef STBI_ONLY_GIF
491 #define STBI_NO_GIF
492 #endif
493 #ifndef STBI_ONLY_HDR
494 #define STBI_NO_HDR
495 #endif
496 #ifndef STBI_ONLY_PIC
497 #define STBI_NO_PIC
498 #endif
499 #ifndef STBI_ONLY_PNM
500 #define STBI_NO_PNM
501 #endif
502 #endif
503 
504 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
505 #define STBI_NO_ZLIB
506 #endif
507 
508 
509 #include <stdarg.h>
510 #include <stddef.h> // ptrdiff_t on osx
511 #include <stdlib.h>
512 #include <string.h>
513 #include <limits.h>
514 
515 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
516 #include <math.h> // ldexp, pow
517 #endif
518 
519 #ifndef STBI_NO_STDIO
520 #include <stdio.h>
521 #endif
522 
523 #ifndef STBI_ASSERT
524 #include <assert.h>
525 #define STBI_ASSERT(x) assert(x)
526 #endif
527 
528 
529 #ifndef _MSC_VER
530 #ifdef __cplusplus
531 #define stbi_inline inline
532 #else
533 #define stbi_inline
534 #endif
535 #else
536 #define stbi_inline __forceinline
537 #endif
538 
539 
540 #ifdef _MSC_VER
541 typedef unsigned short stbi__uint16;
542 typedef signed short stbi__int16;
543 typedef unsigned int stbi__uint32;
544 typedef signed int stbi__int32;
545 #else
546 #include <stdint.h>
547 typedef uint16_t stbi__uint16;
548 typedef int16_t stbi__int16;
549 typedef uint32_t stbi__uint32;
550 typedef int32_t stbi__int32;
551 #endif
552 
553 // should produce compiler error if size is wrong
554 typedef unsigned char validate_uint32[sizeof(stbi__uint32) == 4 ? 1 : -1];
555 
556 #ifdef _MSC_VER
557 #define STBI_NOTUSED(v) (void)(v)
558 #else
559 #define STBI_NOTUSED(v) (void)sizeof(v)
560 #endif
561 
562 #ifdef _MSC_VER
563 #define STBI_HAS_LROTL
564 #endif
565 
566 #ifdef STBI_HAS_LROTL
567 #define stbi_lrot(x,y) _lrotl(x,y)
568 #else
569 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
570 #endif
571 
572 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
573 // ok
574 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
575 // ok
576 #else
577 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
578 #endif
579 
580 #ifndef STBI_MALLOC
581 #define STBI_MALLOC(sz) malloc(sz)
582 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
583 #define STBI_FREE(p) free(p)
584 #endif
585 
586 #ifndef STBI_REALLOC_SIZED
587 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
588 #endif
589 
590 // x86/x64 detection
591 #if defined(__x86_64__) || defined(_M_X64)
592 #define STBI__X64_TARGET
593 #elif defined(__i386) || defined(_M_IX86)
594 #define STBI__X86_TARGET
595 #endif
596 
597 #if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
598 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
599 // which in turn means it gets to use SSE2 everywhere. This is unfortunate,
600 // but previous attempts to provide the SSE2 functions with runtime
601 // detection caused numerous issues. The way architecture extensions are
602 // exposed in GCC/Clang is, sadly, not really suited for one-file libs.
603 // New behavior: if compiled with -msse2, we use SSE2 without any
604 // detection; if not, we don't use it at all.
605 #define STBI_NO_SIMD
606 #endif
607 
608 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
609 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
610 //
611 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
612 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
613 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
614 // simultaneously enabling "-mstackrealign".
615 //
616 // See https://github.com/nothings/stb/issues/81 for more information.
617 //
618 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
619 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
620 #define STBI_NO_SIMD
621 #endif
622 
623 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
624 #define STBI_SSE2
625 #include <emmintrin.h>
626 
627 #ifdef _MSC_VER
628 
629 #if _MSC_VER >= 1400 // not VC6
630 #include <intrin.h> // __cpuid
631 static int stbi__cpuid3(void)
632 {
633  int info[4];
634  __cpuid(info, 1);
635  return info[3];
636 }
637 #else
638 static int stbi__cpuid3(void)
639 {
640  int res;
641  __asm {
642  mov eax, 1
643  cpuid
644  mov res, edx
645  }
646  return res;
647 }
648 #endif
649 
650 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
651 
652 static int stbi__sse2_available(void)
653 {
654  int info3 = stbi__cpuid3();
655  return ((info3 >> 26) & 1) != 0;
656 }
657 #else // assume GCC-style if not VC++
658 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
659 
660 static int stbi__sse2_available(void)
661 {
662  // If we're even attempting to compile this on GCC/Clang, that means
663  // -msse2 is on, which means the compiler is allowed to use SSE2
664  // instructions at will, and so are we.
665  return 1;
666 }
667 #endif
668 #endif
669 
670 // ARM NEON
671 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
672 #undef STBI_NEON
673 #endif
674 
675 #ifdef STBI_NEON
676 #include <arm_neon.h>
677 // assume GCC or Clang on ARM targets
678 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
679 #endif
680 
681 #ifndef STBI_SIMD_ALIGN
682 #define STBI_SIMD_ALIGN(type, name) type name
683 #endif
684 
686 //
687 // stbi__context struct and start_xxx functions
688 
689 // stbi__context structure is our basic context used by all images, so it
690 // contains all the IO context, plus some basic image information
691 typedef struct
692 {
693  stbi__uint32 img_x, img_y;
694  int img_n, img_out_n;
695 
697  void* io_user_data;
698 
699  int read_from_callbacks;
700  int buflen;
701  stbi_uc buffer_start[128];
702 
703  stbi_uc* img_buffer, * img_buffer_end;
704  stbi_uc* img_buffer_original, * img_buffer_original_end;
705 } stbi__context;
706 
707 
708 static void stbi__refill_buffer(stbi__context* s);
709 
710 // initialize a memory-decode context
711 static void stbi__start_mem(stbi__context* s, stbi_uc const* buffer, int len)
712 {
713  s->io.read = NULL;
714  s->read_from_callbacks = 0;
715  s->img_buffer = s->img_buffer_original = (stbi_uc*)buffer;
716  s->img_buffer_end = s->img_buffer_original_end = (stbi_uc*)buffer + len;
717 }
718 
719 // initialize a callback-based context
720 static void stbi__start_callbacks(stbi__context* s, stbi_io_callbacks* c, void* user)
721 {
722  s->io = *c;
723  s->io_user_data = user;
724  s->buflen = sizeof(s->buffer_start);
725  s->read_from_callbacks = 1;
726  s->img_buffer_original = s->buffer_start;
727  stbi__refill_buffer(s);
728  s->img_buffer_original_end = s->img_buffer_end;
729 }
730 
731 #ifndef STBI_NO_STDIO
732 
733 static int stbi__stdio_read(void* user, char* data, int size)
734 {
735  return (int)fread(data, 1, size, (FILE*)user);
736 }
737 
738 static void stbi__stdio_skip(void* user, int n)
739 {
740  fseek((FILE*)user, n, SEEK_CUR);
741 }
742 
743 static int stbi__stdio_eof(void* user)
744 {
745  return feof((FILE*)user);
746 }
747 
748 static stbi_io_callbacks stbi__stdio_callbacks =
749 {
750  stbi__stdio_read,
751  stbi__stdio_skip,
752  stbi__stdio_eof,
753 };
754 
755 static void stbi__start_file(stbi__context* s, FILE* f)
756 {
757  stbi__start_callbacks(s, &stbi__stdio_callbacks, (void*)f);
758 }
759 
760 //static void stop_file(stbi__context *s) { }
761 
762 #endif // !STBI_NO_STDIO
763 
764 static void stbi__rewind(stbi__context* s)
765 {
766  // conceptually rewind SHOULD rewind to the beginning of the stream,
767  // but we just rewind to the beginning of the initial buffer, because
768  // we only use it after doing 'test', which only ever looks at at most 92 bytes
769  s->img_buffer = s->img_buffer_original;
770  s->img_buffer_end = s->img_buffer_original_end;
771 }
772 
773 enum
774 {
775  STBI_ORDER_RGB,
776  STBI_ORDER_BGR
777 };
778 
779 typedef struct
780 {
781  int bits_per_channel;
782  int num_channels;
783  int channel_order;
784 } stbi__result_info;
785 
786 #ifndef STBI_NO_JPEG
787 static int stbi__jpeg_test(stbi__context* s);
788 static void* stbi__jpeg_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
789 static int stbi__jpeg_info(stbi__context* s, int* x, int* y, int* comp);
790 #endif
791 
792 #ifndef STBI_NO_PNG
793 static int stbi__png_test(stbi__context* s);
794 static void* stbi__png_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
795 static int stbi__png_info(stbi__context* s, int* x, int* y, int* comp);
796 static int stbi__png_is16(stbi__context* s);
797 #endif
798 
799 #ifndef STBI_NO_BMP
800 static int stbi__bmp_test(stbi__context* s);
801 static void* stbi__bmp_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
802 static int stbi__bmp_info(stbi__context* s, int* x, int* y, int* comp);
803 #endif
804 
805 #ifndef STBI_NO_TGA
806 static int stbi__tga_test(stbi__context* s);
807 static void* stbi__tga_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
808 static int stbi__tga_info(stbi__context* s, int* x, int* y, int* comp);
809 #endif
810 
811 #ifndef STBI_NO_PSD
812 static int stbi__psd_test(stbi__context* s);
813 static void* stbi__psd_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri, int bpc);
814 static int stbi__psd_info(stbi__context* s, int* x, int* y, int* comp);
815 static int stbi__psd_is16(stbi__context* s);
816 #endif
817 
818 #ifndef STBI_NO_HDR
819 static int stbi__hdr_test(stbi__context* s);
820 static float* stbi__hdr_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
821 static int stbi__hdr_info(stbi__context* s, int* x, int* y, int* comp);
822 #endif
823 
824 #ifndef STBI_NO_PIC
825 static int stbi__pic_test(stbi__context* s);
826 static void* stbi__pic_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
827 static int stbi__pic_info(stbi__context* s, int* x, int* y, int* comp);
828 #endif
829 
830 #ifndef STBI_NO_GIF
831 static int stbi__gif_test(stbi__context* s);
832 static void* stbi__gif_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
833 static void* stbi__load_gif_main(stbi__context* s, int** delays, int* x, int* y, int* z, int* comp, int req_comp);
834 static int stbi__gif_info(stbi__context* s, int* x, int* y, int* comp);
835 #endif
836 
837 #ifndef STBI_NO_PNM
838 static int stbi__pnm_test(stbi__context* s);
839 static void* stbi__pnm_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri);
840 static int stbi__pnm_info(stbi__context* s, int* x, int* y, int* comp);
841 #endif
842 
843 // this is not threadsafe
844 static const char* stbi__g_failure_reason;
845 
846 STBIDEF const char* stbi_failure_reason(void)
847 {
848  return stbi__g_failure_reason;
849 }
850 
851 static int stbi__err(const char* str)
852 {
853  stbi__g_failure_reason = str;
854  return 0;
855 }
856 
857 static void* stbi__malloc(size_t size)
858 {
859  return STBI_MALLOC(size);
860 }
861 
862 // stb_image uses ints pervasively, including for offset calculations.
863 // therefore the largest decoded image size we can support with the
864 // current code, even on 64-bit targets, is INT_MAX. this is not a
865 // significant limitation for the intended use case.
866 //
867 // we do, however, need to make sure our size calculations don't
868 // overflow. hence a few helper functions for size calculations that
869 // multiply integers together, making sure that they're non-negative
870 // and no overflow occurs.
871 
872 // return 1 if the sum is valid, 0 on overflow.
873 // negative terms are considered invalid.
874 static int stbi__addsizes_valid(int a, int b)
875 {
876  if (b < 0) return 0;
877  // now 0 <= b <= INT_MAX, hence also
878  // 0 <= INT_MAX - b <= INTMAX.
879  // And "a + b <= INT_MAX" (which might overflow) is the
880  // same as a <= INT_MAX - b (no overflow)
881  return a <= INT_MAX - b;
882 }
883 
884 // returns 1 if the product is valid, 0 on overflow.
885 // negative factors are considered invalid.
886 static int stbi__mul2sizes_valid(int a, int b)
887 {
888  if (a < 0 || b < 0) return 0;
889  if (b == 0) return 1; // mul-by-0 is always safe
890  // portable way to check for no overflows in a*b
891  return a <= INT_MAX / b;
892 }
893 
894 // returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
895 static int stbi__mad2sizes_valid(int a, int b, int add)
896 {
897  return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a * b, add);
898 }
899 
900 // returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
901 static int stbi__mad3sizes_valid(int a, int b, int c, int add)
902 {
903  return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a * b, c) &&
904  stbi__addsizes_valid(a * b * c, add);
905 }
906 
907 // returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
908 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
909 static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
910 {
911  return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a * b, c) &&
912  stbi__mul2sizes_valid(a * b * c, d) && stbi__addsizes_valid(a * b * c * d, add);
913 }
914 #endif
915 
916 // mallocs with size overflow checking
917 static void* stbi__malloc_mad2(int a, int b, int add)
918 {
919  if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
920  return stbi__malloc(a * b + add);
921 }
922 
923 static void* stbi__malloc_mad3(int a, int b, int c, int add)
924 {
925  if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
926  return stbi__malloc(a * b * c + add);
927 }
928 
929 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
930 static void* stbi__malloc_mad4(int a, int b, int c, int d, int add)
931 {
932  if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
933  return stbi__malloc(a * b * c * d + add);
934 }
935 #endif
936 
937 // stbi__err - error
938 // stbi__errpf - error returning pointer to float
939 // stbi__errpuc - error returning pointer to unsigned char
940 
941 #ifdef STBI_NO_FAILURE_STRINGS
942 #define stbi__err(x,y) 0
943 #elif defined(STBI_FAILURE_USERMSG)
944 #define stbi__err(x,y) stbi__err(y)
945 #else
946 #define stbi__err(x,y) stbi__err(x)
947 #endif
948 
949 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
950 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
951 
952 STBIDEF void stbi_image_free(void* retval_from_stbi_load)
953 {
954  STBI_FREE(retval_from_stbi_load);
955 }
956 
957 #ifndef STBI_NO_LINEAR
958 static float* stbi__ldr_to_hdr(stbi_uc* data, int x, int y, int comp);
959 #endif
960 
961 #ifndef STBI_NO_HDR
962 static stbi_uc* stbi__hdr_to_ldr(float* data, int x, int y, int comp);
963 #endif
964 
965 static int stbi__vertically_flip_on_load = 0;
966 
967 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
968 {
969  stbi__vertically_flip_on_load = flag_true_if_should_flip;
970 }
971 
972 static void* stbi__load_main(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri, int bpc)
973 {
974  memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
975  ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
976  ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
977  ri->num_channels = 0;
978 
979 #ifndef STBI_NO_JPEG
980  if (stbi__jpeg_test(s)) return stbi__jpeg_load(s, x, y, comp, req_comp, ri);
981 #endif
982 #ifndef STBI_NO_PNG
983  if (stbi__png_test(s)) return stbi__png_load(s, x, y, comp, req_comp, ri);
984 #endif
985 #ifndef STBI_NO_BMP
986  if (stbi__bmp_test(s)) return stbi__bmp_load(s, x, y, comp, req_comp, ri);
987 #endif
988 #ifndef STBI_NO_GIF
989  if (stbi__gif_test(s)) return stbi__gif_load(s, x, y, comp, req_comp, ri);
990 #endif
991 #ifndef STBI_NO_PSD
992  if (stbi__psd_test(s)) return stbi__psd_load(s, x, y, comp, req_comp, ri, bpc);
993 #endif
994 #ifndef STBI_NO_PIC
995  if (stbi__pic_test(s)) return stbi__pic_load(s, x, y, comp, req_comp, ri);
996 #endif
997 #ifndef STBI_NO_PNM
998  if (stbi__pnm_test(s)) return stbi__pnm_load(s, x, y, comp, req_comp, ri);
999 #endif
1000 
1001 #ifndef STBI_NO_HDR
1002  if (stbi__hdr_test(s)) {
1003  float* hdr = stbi__hdr_load(s, x, y, comp, req_comp, ri);
1004  return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
1005  }
1006 #endif
1007 
1008 #ifndef STBI_NO_TGA
1009  // test tga last because it's a crappy test!
1010  if (stbi__tga_test(s))
1011  return stbi__tga_load(s, x, y, comp, req_comp, ri);
1012 #endif
1013 
1014  return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
1015 }
1016 
1017 static stbi_uc* stbi__convert_16_to_8(stbi__uint16* orig, int w, int h, int channels)
1018 {
1019  int i;
1020  int img_len = w * h * channels;
1021  stbi_uc* reduced;
1022 
1023  reduced = (stbi_uc*)stbi__malloc(img_len);
1024  if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
1025 
1026  for (i = 0; i < img_len; ++i)
1027  reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is sufficient approx of 16->8 bit scaling
1028 
1029  STBI_FREE(orig);
1030  return reduced;
1031 }
1032 
1033 static stbi__uint16* stbi__convert_8_to_16(stbi_uc* orig, int w, int h, int channels)
1034 {
1035  int i;
1036  int img_len = w * h * channels;
1037  stbi__uint16* enlarged;
1038 
1039  enlarged = (stbi__uint16*)stbi__malloc(img_len * 2);
1040  if (enlarged == NULL) return (stbi__uint16*)stbi__errpuc("outofmem", "Out of memory");
1041 
1042  for (i = 0; i < img_len; ++i)
1043  enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
1044 
1045  STBI_FREE(orig);
1046  return enlarged;
1047 }
1048 
1049 static void stbi__vertical_flip(void* image, int w, int h, int bytes_per_pixel)
1050 {
1051  int row;
1052  size_t bytes_per_row = (size_t)w * bytes_per_pixel;
1053  stbi_uc temp[2048];
1054  stbi_uc* bytes = (stbi_uc*)image;
1055 
1056  for (row = 0; row < (h >> 1); row++) {
1057  stbi_uc* row0 = bytes + row * bytes_per_row;
1058  stbi_uc* row1 = bytes + (h - row - 1) * bytes_per_row;
1059  // swap row0 with row1
1060  size_t bytes_left = bytes_per_row;
1061  while (bytes_left) {
1062  size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
1063  memcpy(temp, row0, bytes_copy);
1064  memcpy(row0, row1, bytes_copy);
1065  memcpy(row1, temp, bytes_copy);
1066  row0 += bytes_copy;
1067  row1 += bytes_copy;
1068  bytes_left -= bytes_copy;
1069  }
1070  }
1071 }
1072 
1073 static void stbi__vertical_flip_slices(void* image, int w, int h, int z, int bytes_per_pixel)
1074 {
1075  int slice;
1076  int slice_size = w * h * bytes_per_pixel;
1077 
1078  stbi_uc* bytes = (stbi_uc*)image;
1079  for (slice = 0; slice < z; ++slice) {
1080  stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
1081  bytes += slice_size;
1082  }
1083 }
1084 
1085 static unsigned char* stbi__load_and_postprocess_8bit(stbi__context* s, int* x, int* y, int* comp, int req_comp)
1086 {
1087  stbi__result_info ri;
1088  void* result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
1089 
1090  if (result == NULL)
1091  return NULL;
1092 
1093  if (ri.bits_per_channel != 8) {
1094  STBI_ASSERT(ri.bits_per_channel == 16);
1095  result = stbi__convert_16_to_8((stbi__uint16*)result, *x, *y, req_comp == 0 ? *comp : req_comp);
1096  ri.bits_per_channel = 8;
1097  }
1098 
1099  // @TODO: move stbi__convert_format to here
1100 
1101  if (stbi__vertically_flip_on_load) {
1102  int channels = req_comp ? req_comp : *comp;
1103  stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
1104  }
1105 
1106  return (unsigned char*)result;
1107 }
1108 
1109 static stbi__uint16* stbi__load_and_postprocess_16bit(stbi__context* s, int* x, int* y, int* comp, int req_comp)
1110 {
1111  stbi__result_info ri;
1112  void* result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
1113 
1114  if (result == NULL)
1115  return NULL;
1116 
1117  if (ri.bits_per_channel != 16) {
1118  STBI_ASSERT(ri.bits_per_channel == 8);
1119  result = stbi__convert_8_to_16((stbi_uc*)result, *x, *y, req_comp == 0 ? *comp : req_comp);
1120  ri.bits_per_channel = 16;
1121  }
1122 
1123  // @TODO: move stbi__convert_format16 to here
1124  // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
1125 
1126  if (stbi__vertically_flip_on_load) {
1127  int channels = req_comp ? req_comp : *comp;
1128  stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
1129  }
1130 
1131  return (stbi__uint16*)result;
1132 }
1133 
1134 #if !defined(STBI_NO_HDR) || !defined(STBI_NO_LINEAR)
1135 static void stbi__float_postprocess(float* result, int* x, int* y, int* comp, int req_comp)
1136 {
1137  if (stbi__vertically_flip_on_load && result != NULL) {
1138  int channels = req_comp ? req_comp : *comp;
1139  stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
1140  }
1141 }
1142 #endif
1143 
1144 #ifndef STBI_NO_STDIO
1145 
1146 static FILE* stbi__fopen(char const* filename, char const* mode)
1147 {
1148  FILE* f;
1149 #if defined(_MSC_VER) && _MSC_VER >= 1400
1150  if (0 != fopen_s(&f, filename, mode))
1151  f = 0;
1152 #else
1153  f = fopen(filename, mode);
1154 #endif
1155  return f;
1156 }
1157 
1158 
1159 STBIDEF stbi_uc* stbi_load(char const* filename, int* x, int* y, int* comp, int req_comp)
1160 {
1161  FILE* f = stbi__fopen(filename, "rb");
1162  unsigned char* result;
1163  if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1164  result = stbi_load_from_file(f, x, y, comp, req_comp);
1165  fclose(f);
1166  return result;
1167 }
1168 
1169 STBIDEF stbi_uc* stbi_load_from_file(FILE* f, int* x, int* y, int* comp, int req_comp)
1170 {
1171  unsigned char* result;
1172  stbi__context s;
1173  stbi__start_file(&s, f);
1174  result = stbi__load_and_postprocess_8bit(&s, x, y, comp, req_comp);
1175  if (result) {
1176  // need to 'unget' all the characters in the IO buffer
1177  fseek(f, -(int)(s.img_buffer_end - s.img_buffer), SEEK_CUR);
1178  }
1179  return result;
1180 }
1181 
1182 STBIDEF stbi__uint16* stbi_load_from_file_16(FILE* f, int* x, int* y, int* comp, int req_comp)
1183 {
1184  stbi__uint16* result;
1185  stbi__context s;
1186  stbi__start_file(&s, f);
1187  result = stbi__load_and_postprocess_16bit(&s, x, y, comp, req_comp);
1188  if (result) {
1189  // need to 'unget' all the characters in the IO buffer
1190  fseek(f, -(int)(s.img_buffer_end - s.img_buffer), SEEK_CUR);
1191  }
1192  return result;
1193 }
1194 
1195 STBIDEF stbi_us* stbi_load_16(char const* filename, int* x, int* y, int* comp, int req_comp)
1196 {
1197  FILE* f = stbi__fopen(filename, "rb");
1198  stbi__uint16* result;
1199  if (!f) return (stbi_us*)stbi__errpuc("can't fopen", "Unable to open file");
1200  result = stbi_load_from_file_16(f, x, y, comp, req_comp);
1201  fclose(f);
1202  return result;
1203 }
1204 
1205 
1206 #endif
1207 
1208 STBIDEF stbi_us* stbi_load_16_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* channels_in_file, int desired_channels)
1209 {
1210  stbi__context s;
1211  stbi__start_mem(&s, buffer, len);
1212  return stbi__load_and_postprocess_16bit(&s, x, y, channels_in_file, desired_channels);
1213 }
1214 
1215 STBIDEF stbi_us* stbi_load_16_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* channels_in_file, int desired_channels)
1216 {
1217  stbi__context s;
1218  stbi__start_callbacks(&s, (stbi_io_callbacks*)clbk, user);
1219  return stbi__load_and_postprocess_16bit(&s, x, y, channels_in_file, desired_channels);
1220 }
1221 
1222 STBIDEF stbi_uc* stbi_load_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* comp, int req_comp)
1223 {
1224  stbi__context s;
1225  stbi__start_mem(&s, buffer, len);
1226  return stbi__load_and_postprocess_8bit(&s, x, y, comp, req_comp);
1227 }
1228 
1229 STBIDEF stbi_uc* stbi_load_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* comp, int req_comp)
1230 {
1231  stbi__context s;
1232  stbi__start_callbacks(&s, (stbi_io_callbacks*)clbk, user);
1233  return stbi__load_and_postprocess_8bit(&s, x, y, comp, req_comp);
1234 }
1235 
1236 #ifndef STBI_NO_GIF
1237 STBIDEF stbi_uc* stbi_load_gif_from_memory(stbi_uc const* buffer, int len, int** delays, int* x, int* y, int* z, int* comp, int req_comp)
1238 {
1239  unsigned char* result;
1240  stbi__context s;
1241  stbi__start_mem(&s, buffer, len);
1242 
1243  result = (unsigned char*)stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
1244  if (stbi__vertically_flip_on_load) {
1245  stbi__vertical_flip_slices(result, *x, *y, *z, *comp);
1246  }
1247 
1248  return result;
1249 }
1250 #endif
1251 
1252 #ifndef STBI_NO_LINEAR
1253 static float* stbi__loadf_main(stbi__context* s, int* x, int* y, int* comp, int req_comp)
1254 {
1255  unsigned char* data;
1256 #ifndef STBI_NO_HDR
1257  if (stbi__hdr_test(s)) {
1258  stbi__result_info ri;
1259  float* hdr_data = stbi__hdr_load(s, x, y, comp, req_comp, &ri);
1260  if (hdr_data)
1261  stbi__float_postprocess(hdr_data, x, y, comp, req_comp);
1262  return hdr_data;
1263  }
1264 #endif
1265  data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
1266  if (data)
1267  return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1268  return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1269 }
1270 
1271 STBIDEF float* stbi_loadf_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* comp, int req_comp)
1272 {
1273  stbi__context s;
1274  stbi__start_mem(&s, buffer, len);
1275  return stbi__loadf_main(&s, x, y, comp, req_comp);
1276 }
1277 
1278 STBIDEF float* stbi_loadf_from_callbacks(stbi_io_callbacks const* clbk, void* user, int* x, int* y, int* comp, int req_comp)
1279 {
1280  stbi__context s;
1281  stbi__start_callbacks(&s, (stbi_io_callbacks*)clbk, user);
1282  return stbi__loadf_main(&s, x, y, comp, req_comp);
1283 }
1284 
1285 #ifndef STBI_NO_STDIO
1286 STBIDEF float* stbi_loadf(char const* filename, int* x, int* y, int* comp, int req_comp)
1287 {
1288  float* result;
1289  FILE* f = stbi__fopen(filename, "rb");
1290  if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1291  result = stbi_loadf_from_file(f, x, y, comp, req_comp);
1292  fclose(f);
1293  return result;
1294 }
1295 
1296 STBIDEF float* stbi_loadf_from_file(FILE* f, int* x, int* y, int* comp, int req_comp)
1297 {
1298  stbi__context s;
1299  stbi__start_file(&s, f);
1300  return stbi__loadf_main(&s, x, y, comp, req_comp);
1301 }
1302 #endif // !STBI_NO_STDIO
1303 
1304 #endif // !STBI_NO_LINEAR
1305 
1306 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1307 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1308 // reports false!
1309 
1310 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const* buffer, int len)
1311 {
1312 #ifndef STBI_NO_HDR
1313  stbi__context s;
1314  stbi__start_mem(&s, buffer, len);
1315  return stbi__hdr_test(&s);
1316 #else
1317  STBI_NOTUSED(buffer);
1318  STBI_NOTUSED(len);
1319  return 0;
1320 #endif
1321 }
1322 
1323 #ifndef STBI_NO_STDIO
1324 STBIDEF int stbi_is_hdr(char const* filename)
1325 {
1326  FILE* f = stbi__fopen(filename, "rb");
1327  int result = 0;
1328  if (f) {
1329  result = stbi_is_hdr_from_file(f);
1330  fclose(f);
1331  }
1332  return result;
1333 }
1334 
1335 STBIDEF int stbi_is_hdr_from_file(FILE* f)
1336 {
1337 #ifndef STBI_NO_HDR
1338  long pos = ftell(f);
1339  int res;
1340  stbi__context s;
1341  stbi__start_file(&s, f);
1342  res = stbi__hdr_test(&s);
1343  fseek(f, pos, SEEK_SET);
1344  return res;
1345 #else
1346  STBI_NOTUSED(f);
1347  return 0;
1348 #endif
1349 }
1350 #endif // !STBI_NO_STDIO
1351 
1352 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const* clbk, void* user)
1353 {
1354 #ifndef STBI_NO_HDR
1355  stbi__context s;
1356  stbi__start_callbacks(&s, (stbi_io_callbacks*)clbk, user);
1357  return stbi__hdr_test(&s);
1358 #else
1359  STBI_NOTUSED(clbk);
1360  STBI_NOTUSED(user);
1361  return 0;
1362 #endif
1363 }
1364 
1365 #ifndef STBI_NO_LINEAR
1366 static float stbi__l2h_gamma = 2.2f, stbi__l2h_scale = 1.0f;
1367 
1368 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1369 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1370 #endif
1371 
1372 static float stbi__h2l_gamma_i = 1.0f / 2.2f, stbi__h2l_scale_i = 1.0f;
1373 
1374 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1 / gamma; }
1375 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1 / scale; }
1376 
1377 
1379 //
1380 // Common code used by all image loaders
1381 //
1382 
1383 enum
1384 {
1385  STBI__SCAN_load = 0,
1386  STBI__SCAN_type,
1387  STBI__SCAN_header
1388 };
1389 
1390 static void stbi__refill_buffer(stbi__context* s)
1391 {
1392  int n = (s->io.read)(s->io_user_data, (char*)s->buffer_start, s->buflen);
1393  if (n == 0) {
1394  // at end of file, treat same as if from memory, but need to handle case
1395  // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1396  s->read_from_callbacks = 0;
1397  s->img_buffer = s->buffer_start;
1398  s->img_buffer_end = s->buffer_start + 1;
1399  *s->img_buffer = 0;
1400  }
1401  else {
1402  s->img_buffer = s->buffer_start;
1403  s->img_buffer_end = s->buffer_start + n;
1404  }
1405 }
1406 
1407 stbi_inline static stbi_uc stbi__get8(stbi__context* s)
1408 {
1409  if (s->img_buffer < s->img_buffer_end)
1410  return *s->img_buffer++;
1411  if (s->read_from_callbacks) {
1412  stbi__refill_buffer(s);
1413  return *s->img_buffer++;
1414  }
1415  return 0;
1416 }
1417 
1418 stbi_inline static int stbi__at_eof(stbi__context* s)
1419 {
1420  if (s->io.read) {
1421  if (!(s->io.eof)(s->io_user_data)) return 0;
1422  // if feof() is true, check if buffer = end
1423  // special case: we've only got the special 0 character at the end
1424  if (s->read_from_callbacks == 0) return 1;
1425  }
1426 
1427  return s->img_buffer >= s->img_buffer_end;
1428 }
1429 
1430 static void stbi__skip(stbi__context* s, int n)
1431 {
1432  if (n < 0) {
1433  s->img_buffer = s->img_buffer_end;
1434  return;
1435  }
1436  if (s->io.read) {
1437  int blen = (int)(s->img_buffer_end - s->img_buffer);
1438  if (blen < n) {
1439  s->img_buffer = s->img_buffer_end;
1440  (s->io.skip)(s->io_user_data, n - blen);
1441  return;
1442  }
1443  }
1444  s->img_buffer += n;
1445 }
1446 
1447 static int stbi__getn(stbi__context* s, stbi_uc* buffer, int n)
1448 {
1449  if (s->io.read) {
1450  int blen = (int)(s->img_buffer_end - s->img_buffer);
1451  if (blen < n) {
1452  int res, count;
1453 
1454  memcpy(buffer, s->img_buffer, blen);
1455 
1456  count = (s->io.read)(s->io_user_data, (char*)buffer + blen, n - blen);
1457  res = (count == (n - blen));
1458  s->img_buffer = s->img_buffer_end;
1459  return res;
1460  }
1461  }
1462 
1463  if (s->img_buffer + n <= s->img_buffer_end) {
1464  memcpy(buffer, s->img_buffer, n);
1465  s->img_buffer += n;
1466  return 1;
1467  }
1468  else
1469  return 0;
1470 }
1471 
1472 static int stbi__get16be(stbi__context* s)
1473 {
1474  int z = stbi__get8(s);
1475  return (z << 8) + stbi__get8(s);
1476 }
1477 
1478 static stbi__uint32 stbi__get32be(stbi__context* s)
1479 {
1480  stbi__uint32 z = stbi__get16be(s);
1481  return (z << 16) + stbi__get16be(s);
1482 }
1483 
1484 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1485 // nothing
1486 #else
1487 static int stbi__get16le(stbi__context* s)
1488 {
1489  int z = stbi__get8(s);
1490  return z + (stbi__get8(s) << 8);
1491 }
1492 #endif
1493 
1494 #ifndef STBI_NO_BMP
1495 static stbi__uint32 stbi__get32le(stbi__context* s)
1496 {
1497  stbi__uint32 z = stbi__get16le(s);
1498  return z + (stbi__get16le(s) << 16);
1499 }
1500 #endif
1501 
1502 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1503 
1504 
1506 //
1507 // generic converter from built-in img_n to req_comp
1508 // individual types do this automatically as much as possible (e.g. jpeg
1509 // does all cases internally since it needs to colorspace convert anyway,
1510 // and it never has alpha, so very few cases ). png can automatically
1511 // interleave an alpha=255 channel, but falls back to this for other cases
1512 //
1513 // assume data buffer is malloced, so malloc a new one and free that one
1514 // only failure mode is malloc failing
1515 
1516 static stbi_uc stbi__compute_y(int r, int g, int b)
1517 {
1518  return (stbi_uc)(((r * 77) + (g * 150) + (29 * b)) >> 8);
1519 }
1520 
1521 static unsigned char* stbi__convert_format(unsigned char* data, int img_n, int req_comp, unsigned int x, unsigned int y)
1522 {
1523  int i, j;
1524  unsigned char* good;
1525 
1526  if (req_comp == img_n) return data;
1527  STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1528 
1529  good = (unsigned char*)stbi__malloc_mad3(req_comp, x, y, 0);
1530  if (good == NULL) {
1531  STBI_FREE(data);
1532  return stbi__errpuc("outofmem", "Out of memory");
1533  }
1534 
1535  for (j = 0; j < (int)y; ++j) {
1536  unsigned char* src = data + j * x * img_n;
1537  unsigned char* dest = good + j * x * req_comp;
1538 
1539 #define STBI__COMBO(a,b) ((a)*8+(b))
1540 #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1541  // convert source image with img_n components to one with req_comp components;
1542  // avoid switch per pixel, so use switch per scanline and massive macros
1543  switch (STBI__COMBO(img_n, req_comp)) {
1544  STBI__CASE(1, 2) { dest[0] = src[0], dest[1] = 255; } break;
1545  STBI__CASE(1, 3) { dest[0] = dest[1] = dest[2] = src[0]; } break;
1546  STBI__CASE(1, 4) { dest[0] = dest[1] = dest[2] = src[0], dest[3] = 255; } break;
1547  STBI__CASE(2, 1) { dest[0] = src[0]; } break;
1548  STBI__CASE(2, 3) { dest[0] = dest[1] = dest[2] = src[0]; } break;
1549  STBI__CASE(2, 4) { dest[0] = dest[1] = dest[2] = src[0], dest[3] = src[1]; } break;
1550  STBI__CASE(3, 4) { dest[0] = src[0], dest[1] = src[1], dest[2] = src[2], dest[3] = 255; } break;
1551  STBI__CASE(3, 1) { dest[0] = stbi__compute_y(src[0], src[1], src[2]); } break;
1552  STBI__CASE(3, 2) { dest[0] = stbi__compute_y(src[0], src[1], src[2]), dest[1] = 255; } break;
1553  STBI__CASE(4, 1) { dest[0] = stbi__compute_y(src[0], src[1], src[2]); } break;
1554  STBI__CASE(4, 2) { dest[0] = stbi__compute_y(src[0], src[1], src[2]), dest[1] = src[3]; } break;
1555  STBI__CASE(4, 3) { dest[0] = src[0], dest[1] = src[1], dest[2] = src[2]; } break;
1556  default: STBI_ASSERT(0);
1557  }
1558 #undef STBI__CASE
1559  }
1560 
1561  STBI_FREE(data);
1562  return good;
1563 }
1564 
1565 static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
1566 {
1567  return (stbi__uint16)(((r * 77) + (g * 150) + (29 * b)) >> 8);
1568 }
1569 
1570 static stbi__uint16* stbi__convert_format16(stbi__uint16* data, int img_n, int req_comp, unsigned int x, unsigned int y)
1571 {
1572  int i, j;
1573  stbi__uint16* good;
1574 
1575  if (req_comp == img_n) return data;
1576  STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1577 
1578  good = (stbi__uint16*)stbi__malloc(req_comp * x * y * 2);
1579  if (good == NULL) {
1580  STBI_FREE(data);
1581  return (stbi__uint16*)stbi__errpuc("outofmem", "Out of memory");
1582  }
1583 
1584  for (j = 0; j < (int)y; ++j) {
1585  stbi__uint16* src = data + j * x * img_n;
1586  stbi__uint16* dest = good + j * x * req_comp;
1587 
1588 #define STBI__COMBO(a,b) ((a)*8+(b))
1589 #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1590  // convert source image with img_n components to one with req_comp components;
1591  // avoid switch per pixel, so use switch per scanline and massive macros
1592  switch (STBI__COMBO(img_n, req_comp)) {
1593  STBI__CASE(1, 2) { dest[0] = src[0], dest[1] = 0xffff; } break;
1594  STBI__CASE(1, 3) { dest[0] = dest[1] = dest[2] = src[0]; } break;
1595  STBI__CASE(1, 4) { dest[0] = dest[1] = dest[2] = src[0], dest[3] = 0xffff; } break;
1596  STBI__CASE(2, 1) { dest[0] = src[0]; } break;
1597  STBI__CASE(2, 3) { dest[0] = dest[1] = dest[2] = src[0]; } break;
1598  STBI__CASE(2, 4) { dest[0] = dest[1] = dest[2] = src[0], dest[3] = src[1]; } break;
1599  STBI__CASE(3, 4) { dest[0] = src[0], dest[1] = src[1], dest[2] = src[2], dest[3] = 0xffff; } break;
1600  STBI__CASE(3, 1) { dest[0] = stbi__compute_y_16(src[0], src[1], src[2]); } break;
1601  STBI__CASE(3, 2) { dest[0] = stbi__compute_y_16(src[0], src[1], src[2]), dest[1] = 0xffff; } break;
1602  STBI__CASE(4, 1) { dest[0] = stbi__compute_y_16(src[0], src[1], src[2]); } break;
1603  STBI__CASE(4, 2) { dest[0] = stbi__compute_y_16(src[0], src[1], src[2]), dest[1] = src[3]; } break;
1604  STBI__CASE(4, 3) { dest[0] = src[0], dest[1] = src[1], dest[2] = src[2]; } break;
1605  default: STBI_ASSERT(0);
1606  }
1607 #undef STBI__CASE
1608  }
1609 
1610  STBI_FREE(data);
1611  return good;
1612 }
1613 
1614 #ifndef STBI_NO_LINEAR
1615 static float* stbi__ldr_to_hdr(stbi_uc* data, int x, int y, int comp)
1616 {
1617  int i, k, n;
1618  float* output;
1619  if (!data) return NULL;
1620  output = (float*)stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
1621  if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1622  // compute number of non-alpha components
1623  if (comp & 1) n = comp; else n = comp - 1;
1624  for (i = 0; i < x * y; ++i) {
1625  for (k = 0; k < n; ++k) {
1626  output[i * comp + k] = (float)(pow(data[i * comp + k] / 255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1627  }
1628  if (k < comp) output[i * comp + k] = data[i * comp + k] / 255.0f;
1629  }
1630  STBI_FREE(data);
1631  return output;
1632 }
1633 #endif
1634 
1635 #ifndef STBI_NO_HDR
1636 #define stbi__float2int(x) ((int) (x))
1637 static stbi_uc* stbi__hdr_to_ldr(float* data, int x, int y, int comp)
1638 {
1639  int i, k, n;
1640  stbi_uc* output;
1641  if (!data) return NULL;
1642  output = (stbi_uc*)stbi__malloc_mad3(x, y, comp, 0);
1643  if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1644  // compute number of non-alpha components
1645  if (comp & 1) n = comp; else n = comp - 1;
1646  for (i = 0; i < x * y; ++i) {
1647  for (k = 0; k < n; ++k) {
1648  float z = (float)pow(data[i * comp + k] * stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1649  if (z < 0) z = 0;
1650  if (z > 255) z = 255;
1651  output[i * comp + k] = (stbi_uc)stbi__float2int(z);
1652  }
1653  if (k < comp) {
1654  float z = data[i * comp + k] * 255 + 0.5f;
1655  if (z < 0) z = 0;
1656  if (z > 255) z = 255;
1657  output[i * comp + k] = (stbi_uc)stbi__float2int(z);
1658  }
1659  }
1660  STBI_FREE(data);
1661  return output;
1662 }
1663 #endif
1664 
1666 //
1667 // "baseline" JPEG/JFIF decoder
1668 //
1669 // simple implementation
1670 // - doesn't support delayed output of y-dimension
1671 // - simple interface (only one output format: 8-bit interleaved RGB)
1672 // - doesn't try to recover corrupt jpegs
1673 // - doesn't allow partial loading, loading multiple at once
1674 // - still fast on x86 (copying globals into locals doesn't help x86)
1675 // - allocates lots of intermediate memory (full size of all components)
1676 // - non-interleaved case requires this anyway
1677 // - allows good upsampling (see next)
1678 // high-quality
1679 // - upsampled channels are bilinearly interpolated, even across blocks
1680 // - quality integer IDCT derived from IJG's 'slow'
1681 // performance
1682 // - fast huffman; reasonable integer IDCT
1683 // - some SIMD kernels for common paths on targets with SSE2/NEON
1684 // - uses a lot of intermediate memory, could cache poorly
1685 
1686 #ifndef STBI_NO_JPEG
1687 
1688 // huffman decoding acceleration
1689 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1690 
1691 typedef struct
1692 {
1693  stbi_uc fast[1 << FAST_BITS];
1694  // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1695  stbi__uint16 code[256];
1696  stbi_uc values[256];
1697  stbi_uc size[257];
1698  unsigned int maxcode[18];
1699  int delta[17]; // old 'firstsymbol' - old 'firstcode'
1700 } stbi__huffman;
1701 
1702 typedef struct
1703 {
1704  stbi__context* s;
1705  stbi__huffman huff_dc[4];
1706  stbi__huffman huff_ac[4];
1707  stbi__uint16 dequant[4][64];
1708  stbi__int16 fast_ac[4][1 << FAST_BITS];
1709 
1710  // sizes for components, interleaved MCUs
1711  int img_h_max, img_v_max;
1712  int img_mcu_x, img_mcu_y;
1713  int img_mcu_w, img_mcu_h;
1714 
1715  // definition of jpeg image component
1716  struct
1717  {
1718  int id;
1719  int h, v;
1720  int tq;
1721  int hd, ha;
1722  int dc_pred;
1723 
1724  int x, y, w2, h2;
1725  stbi_uc* data;
1726  void* raw_data, * raw_coeff;
1727  stbi_uc* linebuf;
1728  short* coeff; // progressive only
1729  int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1730  } img_comp[4];
1731 
1732  stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1733  int code_bits; // number of valid bits
1734  unsigned char marker; // marker seen while filling entropy buffer
1735  int nomore; // flag if we saw a marker so must stop
1736 
1737  int progressive;
1738  int spec_start;
1739  int spec_end;
1740  int succ_high;
1741  int succ_low;
1742  int eob_run;
1743  int jfif;
1744  int app14_color_transform; // Adobe APP14 tag
1745  int rgb;
1746 
1747  int scan_n, order[4];
1748  int restart_interval, todo;
1749 
1750  // kernels
1751  void(*idct_block_kernel)(stbi_uc* out, int out_stride, short data[64]);
1752  void(*YCbCr_to_RGB_kernel)(stbi_uc* out, const stbi_uc* y, const stbi_uc* pcb, const stbi_uc* pcr, int count, int step);
1753  stbi_uc* (*resample_row_hv_2_kernel)(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs);
1754 } stbi__jpeg;
1755 
1756 static int stbi__build_huffman(stbi__huffman* h, int* count)
1757 {
1758  int i, j, k = 0;
1759  unsigned int code;
1760  // build size list for each symbol (from JPEG spec)
1761  for (i = 0; i < 16; ++i)
1762  for (j = 0; j < count[i]; ++j)
1763  h->size[k++] = (stbi_uc)(i + 1);
1764  h->size[k] = 0;
1765 
1766  // compute actual symbols (from jpeg spec)
1767  code = 0;
1768  k = 0;
1769  for (j = 1; j <= 16; ++j) {
1770  // compute delta to add to code to compute symbol id
1771  h->delta[j] = k - code;
1772  if (h->size[k] == j) {
1773  while (h->size[k] == j)
1774  h->code[k++] = (stbi__uint16)(code++);
1775  if (code - 1 >= (1u << j)) return stbi__err("bad code lengths", "Corrupt JPEG");
1776  }
1777  // compute largest code + 1 for this size, preshifted as needed later
1778  h->maxcode[j] = code << (16 - j);
1779  code <<= 1;
1780  }
1781  h->maxcode[j] = 0xffffffff;
1782 
1783  // build non-spec acceleration table; 255 is flag for not-accelerated
1784  memset(h->fast, 255, 1 << FAST_BITS);
1785  for (i = 0; i < k; ++i) {
1786  int s = h->size[i];
1787  if (s <= FAST_BITS) {
1788  int c = h->code[i] << (FAST_BITS - s);
1789  int m = 1 << (FAST_BITS - s);
1790  for (j = 0; j < m; ++j) {
1791  h->fast[c + j] = (stbi_uc)i;
1792  }
1793  }
1794  }
1795  return 1;
1796 }
1797 
1798 // build a table that decodes both magnitude and value of small ACs in
1799 // one go.
1800 static void stbi__build_fast_ac(stbi__int16* fast_ac, stbi__huffman* h)
1801 {
1802  int i;
1803  for (i = 0; i < (1 << FAST_BITS); ++i) {
1804  stbi_uc fast = h->fast[i];
1805  fast_ac[i] = 0;
1806  if (fast < 255) {
1807  int rs = h->values[fast];
1808  int run = (rs >> 4) & 15;
1809  int magbits = rs & 15;
1810  int len = h->size[fast];
1811 
1812  if (magbits && len + magbits <= FAST_BITS) {
1813  // magnitude code followed by receive_extend code
1814  int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1815  int m = 1 << (magbits - 1);
1816  if (k < m) k += (~0U << magbits) + 1;
1817  // if the result is small enough, we can fit it in fast_ac table
1818  if (k >= -128 && k <= 127)
1819  fast_ac[i] = (stbi__int16)((k * 256) + (run * 16) + (len + magbits));
1820  }
1821  }
1822  }
1823 }
1824 
1825 static void stbi__grow_buffer_unsafe(stbi__jpeg* j)
1826 {
1827  do {
1828  unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
1829  if (b == 0xff) {
1830  int c = stbi__get8(j->s);
1831  while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
1832  if (c != 0) {
1833  j->marker = (unsigned char)c;
1834  j->nomore = 1;
1835  return;
1836  }
1837  }
1838  j->code_buffer |= b << (24 - j->code_bits);
1839  j->code_bits += 8;
1840  } while (j->code_bits <= 24);
1841 }
1842 
1843 // (1 << n) - 1
1844 static const stbi__uint32 stbi__bmask[17] = { 0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535 };
1845 
1846 // decode a jpeg huffman value from the bitstream
1847 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg* j, stbi__huffman* h)
1848 {
1849  unsigned int temp;
1850  int c, k;
1851 
1852  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1853 
1854  // look at the top FAST_BITS and determine what symbol ID it is,
1855  // if the code is <= FAST_BITS
1856  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS) - 1);
1857  k = h->fast[c];
1858  if (k < 255) {
1859  int s = h->size[k];
1860  if (s > j->code_bits)
1861  return -1;
1862  j->code_buffer <<= s;
1863  j->code_bits -= s;
1864  return h->values[k];
1865  }
1866 
1867  // naive test is to shift the code_buffer down so k bits are
1868  // valid, then test against maxcode. To speed this up, we've
1869  // preshifted maxcode left so that it has (16-k) 0s at the
1870  // end; in other words, regardless of the number of bits, it
1871  // wants to be compared against something shifted to have 16;
1872  // that way we don't need to shift inside the loop.
1873  temp = j->code_buffer >> 16;
1874  for (k = FAST_BITS + 1; ; ++k)
1875  if (temp < h->maxcode[k])
1876  break;
1877  if (k == 17) {
1878  // error! code not found
1879  j->code_bits -= 16;
1880  return -1;
1881  }
1882 
1883  if (k > j->code_bits)
1884  return -1;
1885 
1886  // convert the huffman code to the symbol id
1887  c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1888  STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1889 
1890  // convert the id to a symbol
1891  j->code_bits -= k;
1892  j->code_buffer <<= k;
1893  return h->values[c];
1894 }
1895 
1896 // bias[n] = (-1<<n) + 1
1897 static const int stbi__jbias[16] = { 0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767 };
1898 
1899 // combined JPEG 'receive' and JPEG 'extend', since baseline
1900 // always extends everything it receives.
1901 stbi_inline static int stbi__extend_receive(stbi__jpeg* j, int n)
1902 {
1903  unsigned int k;
1904  int sgn;
1905  if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1906 
1907  sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1908  k = stbi_lrot(j->code_buffer, n);
1909  STBI_ASSERT(n >= 0 && n < (int)(sizeof(stbi__bmask) / sizeof(*stbi__bmask)));
1910  j->code_buffer = k & ~stbi__bmask[n];
1911  k &= stbi__bmask[n];
1912  j->code_bits -= n;
1913  return k + (stbi__jbias[n] & ~sgn);
1914 }
1915 
1916 // get some unsigned bits
1917 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg* j, int n)
1918 {
1919  unsigned int k;
1920  if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1921  k = stbi_lrot(j->code_buffer, n);
1922  j->code_buffer = k & ~stbi__bmask[n];
1923  k &= stbi__bmask[n];
1924  j->code_bits -= n;
1925  return k;
1926 }
1927 
1928 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg* j)
1929 {
1930  unsigned int k;
1931  if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1932  k = j->code_buffer;
1933  j->code_buffer <<= 1;
1934  --j->code_bits;
1935  return k & 0x80000000;
1936 }
1937 
1938 // given a value that's at position X in the zigzag stream,
1939 // where does it appear in the 8x8 matrix coded as row-major?
1940 static const stbi_uc stbi__jpeg_dezigzag[64 + 15] =
1941 {
1942  0, 1, 8, 16, 9, 2, 3, 10,
1943  17, 24, 32, 25, 18, 11, 4, 5,
1944  12, 19, 26, 33, 40, 48, 41, 34,
1945  27, 20, 13, 6, 7, 14, 21, 28,
1946  35, 42, 49, 56, 57, 50, 43, 36,
1947  29, 22, 15, 23, 30, 37, 44, 51,
1948  58, 59, 52, 45, 38, 31, 39, 46,
1949  53, 60, 61, 54, 47, 55, 62, 63,
1950  // let corrupt input sample past end
1951  63, 63, 63, 63, 63, 63, 63, 63,
1952  63, 63, 63, 63, 63, 63, 63
1953 };
1954 
1955 // decode one 64-entry block--
1956 static int stbi__jpeg_decode_block(stbi__jpeg* j, short data[64], stbi__huffman* hdc, stbi__huffman* hac, stbi__int16* fac, int b, stbi__uint16* dequant)
1957 {
1958  int diff, dc, k;
1959  int t;
1960 
1961  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1962  t = stbi__jpeg_huff_decode(j, hdc);
1963  if (t < 0) return stbi__err("bad huffman code", "Corrupt JPEG");
1964 
1965  // 0 all the ac values now so we can do it 32-bits at a time
1966  memset(data, 0, 64 * sizeof(data[0]));
1967 
1968  diff = t ? stbi__extend_receive(j, t) : 0;
1969  dc = j->img_comp[b].dc_pred + diff;
1970  j->img_comp[b].dc_pred = dc;
1971  data[0] = (short)(dc * dequant[0]);
1972 
1973  // decode AC components, see JPEG spec
1974  k = 1;
1975  do {
1976  unsigned int zig;
1977  int c, r, s;
1978  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1979  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS) - 1);
1980  r = fac[c];
1981  if (r) { // fast-AC path
1982  k += (r >> 4) & 15; // run
1983  s = r & 15; // combined length
1984  j->code_buffer <<= s;
1985  j->code_bits -= s;
1986  // decode into unzigzag'd location
1987  zig = stbi__jpeg_dezigzag[k++];
1988  data[zig] = (short)((r >> 8) * dequant[zig]);
1989  }
1990  else {
1991  int rs = stbi__jpeg_huff_decode(j, hac);
1992  if (rs < 0) return stbi__err("bad huffman code", "Corrupt JPEG");
1993  s = rs & 15;
1994  r = rs >> 4;
1995  if (s == 0) {
1996  if (rs != 0xf0) break; // end block
1997  k += 16;
1998  }
1999  else {
2000  k += r;
2001  // decode into unzigzag'd location
2002  zig = stbi__jpeg_dezigzag[k++];
2003  data[zig] = (short)(stbi__extend_receive(j, s) * dequant[zig]);
2004  }
2005  }
2006  } while (k < 64);
2007  return 1;
2008 }
2009 
2010 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg* j, short data[64], stbi__huffman* hdc, int b)
2011 {
2012  int diff, dc;
2013  int t;
2014  if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2015 
2016  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2017 
2018  if (j->succ_high == 0) {
2019  // first scan for DC coefficient, must be first
2020  memset(data, 0, 64 * sizeof(data[0])); // 0 all the ac values now
2021  t = stbi__jpeg_huff_decode(j, hdc);
2022  diff = t ? stbi__extend_receive(j, t) : 0;
2023 
2024  dc = j->img_comp[b].dc_pred + diff;
2025  j->img_comp[b].dc_pred = dc;
2026  data[0] = (short)(dc << j->succ_low);
2027  }
2028  else {
2029  // refinement scan for DC coefficient
2030  if (stbi__jpeg_get_bit(j))
2031  data[0] += (short)(1 << j->succ_low);
2032  }
2033  return 1;
2034 }
2035 
2036 // @OPTIMIZE: store non-zigzagged during the decode passes,
2037 // and only de-zigzag when dequantizing
2038 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg* j, short data[64], stbi__huffman* hac, stbi__int16* fac)
2039 {
2040  int k;
2041  if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2042 
2043  if (j->succ_high == 0) {
2044  int shift = j->succ_low;
2045 
2046  if (j->eob_run) {
2047  --j->eob_run;
2048  return 1;
2049  }
2050 
2051  k = j->spec_start;
2052  do {
2053  unsigned int zig;
2054  int c, r, s;
2055  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2056  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS) - 1);
2057  r = fac[c];
2058  if (r) { // fast-AC path
2059  k += (r >> 4) & 15; // run
2060  s = r & 15; // combined length
2061  j->code_buffer <<= s;
2062  j->code_bits -= s;
2063  zig = stbi__jpeg_dezigzag[k++];
2064  data[zig] = (short)((r >> 8) << shift);
2065  }
2066  else {
2067  int rs = stbi__jpeg_huff_decode(j, hac);
2068  if (rs < 0) return stbi__err("bad huffman code", "Corrupt JPEG");
2069  s = rs & 15;
2070  r = rs >> 4;
2071  if (s == 0) {
2072  if (r < 15) {
2073  j->eob_run = (1 << r);
2074  if (r)
2075  j->eob_run += stbi__jpeg_get_bits(j, r);
2076  --j->eob_run;
2077  break;
2078  }
2079  k += 16;
2080  }
2081  else {
2082  k += r;
2083  zig = stbi__jpeg_dezigzag[k++];
2084  data[zig] = (short)(stbi__extend_receive(j, s) << shift);
2085  }
2086  }
2087  } while (k <= j->spec_end);
2088  }
2089  else {
2090  // refinement scan for these AC coefficients
2091 
2092  short bit = (short)(1 << j->succ_low);
2093 
2094  if (j->eob_run) {
2095  --j->eob_run;
2096  for (k = j->spec_start; k <= j->spec_end; ++k) {
2097  short* p = &data[stbi__jpeg_dezigzag[k]];
2098  if (*p != 0)
2099  if (stbi__jpeg_get_bit(j))
2100  if ((*p & bit) == 0) {
2101  if (*p > 0)
2102  * p += bit;
2103  else
2104  *p -= bit;
2105  }
2106  }
2107  }
2108  else {
2109  k = j->spec_start;
2110  do {
2111  int r, s;
2112  int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
2113  if (rs < 0) return stbi__err("bad huffman code", "Corrupt JPEG");
2114  s = rs & 15;
2115  r = rs >> 4;
2116  if (s == 0) {
2117  if (r < 15) {
2118  j->eob_run = (1 << r) - 1;
2119  if (r)
2120  j->eob_run += stbi__jpeg_get_bits(j, r);
2121  r = 64; // force end of block
2122  }
2123  else {
2124  // r=15 s=0 should write 16 0s, so we just do
2125  // a run of 15 0s and then write s (which is 0),
2126  // so we don't have to do anything special here
2127  }
2128  }
2129  else {
2130  if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
2131  // sign bit
2132  if (stbi__jpeg_get_bit(j))
2133  s = bit;
2134  else
2135  s = -bit;
2136  }
2137 
2138  // advance by r
2139  while (k <= j->spec_end) {
2140  short* p = &data[stbi__jpeg_dezigzag[k++]];
2141  if (*p != 0) {
2142  if (stbi__jpeg_get_bit(j))
2143  if ((*p & bit) == 0) {
2144  if (*p > 0)
2145  * p += bit;
2146  else
2147  *p -= bit;
2148  }
2149  }
2150  else {
2151  if (r == 0) {
2152  *p = (short)s;
2153  break;
2154  }
2155  --r;
2156  }
2157  }
2158  } while (k <= j->spec_end);
2159  }
2160  }
2161  return 1;
2162 }
2163 
2164 // take a -128..127 value and stbi__clamp it and convert to 0..255
2165 stbi_inline static stbi_uc stbi__clamp(int x)
2166 {
2167  // trick to use a single test to catch both cases
2168  if ((unsigned int)x > 255) {
2169  if (x < 0) return 0;
2170  if (x > 255) return 255;
2171  }
2172  return (stbi_uc)x;
2173 }
2174 
2175 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
2176 #define stbi__fsh(x) ((x) * 4096)
2177 
2178 // derived from jidctint -- DCT_ISLOW
2179 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
2180  int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
2181  p2 = s2; \
2182  p3 = s6; \
2183  p1 = (p2+p3) * stbi__f2f(0.5411961f); \
2184  t2 = p1 + p3*stbi__f2f(-1.847759065f); \
2185  t3 = p1 + p2*stbi__f2f( 0.765366865f); \
2186  p2 = s0; \
2187  p3 = s4; \
2188  t0 = stbi__fsh(p2+p3); \
2189  t1 = stbi__fsh(p2-p3); \
2190  x0 = t0+t3; \
2191  x3 = t0-t3; \
2192  x1 = t1+t2; \
2193  x2 = t1-t2; \
2194  t0 = s7; \
2195  t1 = s5; \
2196  t2 = s3; \
2197  t3 = s1; \
2198  p3 = t0+t2; \
2199  p4 = t1+t3; \
2200  p1 = t0+t3; \
2201  p2 = t1+t2; \
2202  p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
2203  t0 = t0*stbi__f2f( 0.298631336f); \
2204  t1 = t1*stbi__f2f( 2.053119869f); \
2205  t2 = t2*stbi__f2f( 3.072711026f); \
2206  t3 = t3*stbi__f2f( 1.501321110f); \
2207  p1 = p5 + p1*stbi__f2f(-0.899976223f); \
2208  p2 = p5 + p2*stbi__f2f(-2.562915447f); \
2209  p3 = p3*stbi__f2f(-1.961570560f); \
2210  p4 = p4*stbi__f2f(-0.390180644f); \
2211  t3 += p1+p4; \
2212  t2 += p2+p3; \
2213  t1 += p2+p4; \
2214  t0 += p1+p3;
2215 
2216 static void stbi__idct_block(stbi_uc* out, int out_stride, short data[64])
2217 {
2218  int i, val[64], * v = val;
2219  stbi_uc* o;
2220  short* d = data;
2221 
2222  // columns
2223  for (i = 0; i < 8; ++i, ++d, ++v) {
2224  // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
2225  if (d[8] == 0 && d[16] == 0 && d[24] == 0 && d[32] == 0
2226  && d[40] == 0 && d[48] == 0 && d[56] == 0) {
2227  // no shortcut 0 seconds
2228  // (1|2|3|4|5|6|7)==0 0 seconds
2229  // all separate -0.047 seconds
2230  // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
2231  int dcterm = d[0] * 4;
2232  v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
2233  }
2234  else {
2235  STBI__IDCT_1D(d[0], d[8], d[16], d[24], d[32], d[40], d[48], d[56])
2236  // constants scaled things up by 1<<12; let's bring them back
2237  // down, but keep 2 extra bits of precision
2238  x0 += 512; x1 += 512; x2 += 512; x3 += 512;
2239  v[0] = (x0 + t3) >> 10;
2240  v[56] = (x0 - t3) >> 10;
2241  v[8] = (x1 + t2) >> 10;
2242  v[48] = (x1 - t2) >> 10;
2243  v[16] = (x2 + t1) >> 10;
2244  v[40] = (x2 - t1) >> 10;
2245  v[24] = (x3 + t0) >> 10;
2246  v[32] = (x3 - t0) >> 10;
2247  }
2248  }
2249 
2250  for (i = 0, v = val, o = out; i < 8; ++i, v += 8, o += out_stride) {
2251  // no fast case since the first 1D IDCT spread components out
2252  STBI__IDCT_1D(v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7])
2253  // constants scaled things up by 1<<12, plus we had 1<<2 from first
2254  // loop, plus horizontal and vertical each scale by sqrt(8) so together
2255  // we've got an extra 1<<3, so 1<<17 total we need to remove.
2256  // so we want to round that, which means adding 0.5 * 1<<17,
2257  // aka 65536. Also, we'll end up with -128 to 127 that we want
2258  // to encode as 0..255 by adding 128, so we'll add that before the shift
2259  x0 += 65536 + (128 << 17);
2260  x1 += 65536 + (128 << 17);
2261  x2 += 65536 + (128 << 17);
2262  x3 += 65536 + (128 << 17);
2263  // tried computing the shifts into temps, or'ing the temps to see
2264  // if any were out of range, but that was slower
2265  o[0] = stbi__clamp((x0 + t3) >> 17);
2266  o[7] = stbi__clamp((x0 - t3) >> 17);
2267  o[1] = stbi__clamp((x1 + t2) >> 17);
2268  o[6] = stbi__clamp((x1 - t2) >> 17);
2269  o[2] = stbi__clamp((x2 + t1) >> 17);
2270  o[5] = stbi__clamp((x2 - t1) >> 17);
2271  o[3] = stbi__clamp((x3 + t0) >> 17);
2272  o[4] = stbi__clamp((x3 - t0) >> 17);
2273  }
2274 }
2275 
2276 #ifdef STBI_SSE2
2277 // sse2 integer IDCT. not the fastest possible implementation but it
2278 // produces bit-identical results to the generic C version so it's
2279 // fully "transparent".
2280 static void stbi__idct_simd(stbi_uc* out, int out_stride, short data[64])
2281 {
2282  // This is constructed to match our regular (generic) integer IDCT exactly.
2283  __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2284  __m128i tmp;
2285 
2286  // dot product constant: even elems=x, odd elems=y
2287 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2288 
2289  // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2290  // out(1) = c1[even]*x + c1[odd]*y
2291 #define dct_rot(out0,out1, x,y,c0,c1) \
2292  __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2293  __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2294  __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2295  __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2296  __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2297  __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2298 
2299  // out = in << 12 (in 16-bit, out 32-bit)
2300 #define dct_widen(out, in) \
2301  __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2302  __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2303 
2304  // wide add
2305 #define dct_wadd(out, a, b) \
2306  __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2307  __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2308 
2309  // wide sub
2310 #define dct_wsub(out, a, b) \
2311  __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2312  __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2313 
2314  // butterfly a/b, add bias, then shift by "s" and pack
2315 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2316  { \
2317  __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2318  __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2319  dct_wadd(sum, abiased, b); \
2320  dct_wsub(dif, abiased, b); \
2321  out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2322  out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2323  }
2324 
2325  // 8-bit interleave step (for transposes)
2326 #define dct_interleave8(a, b) \
2327  tmp = a; \
2328  a = _mm_unpacklo_epi8(a, b); \
2329  b = _mm_unpackhi_epi8(tmp, b)
2330 
2331  // 16-bit interleave step (for transposes)
2332 #define dct_interleave16(a, b) \
2333  tmp = a; \
2334  a = _mm_unpacklo_epi16(a, b); \
2335  b = _mm_unpackhi_epi16(tmp, b)
2336 
2337 #define dct_pass(bias,shift) \
2338  { \
2339  /* even part */ \
2340  dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2341  __m128i sum04 = _mm_add_epi16(row0, row4); \
2342  __m128i dif04 = _mm_sub_epi16(row0, row4); \
2343  dct_widen(t0e, sum04); \
2344  dct_widen(t1e, dif04); \
2345  dct_wadd(x0, t0e, t3e); \
2346  dct_wsub(x3, t0e, t3e); \
2347  dct_wadd(x1, t1e, t2e); \
2348  dct_wsub(x2, t1e, t2e); \
2349  /* odd part */ \
2350  dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2351  dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2352  __m128i sum17 = _mm_add_epi16(row1, row7); \
2353  __m128i sum35 = _mm_add_epi16(row3, row5); \
2354  dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2355  dct_wadd(x4, y0o, y4o); \
2356  dct_wadd(x5, y1o, y5o); \
2357  dct_wadd(x6, y2o, y5o); \
2358  dct_wadd(x7, y3o, y4o); \
2359  dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2360  dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2361  dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2362  dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2363  }
2364 
2365  __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2366  __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f(0.765366865f), stbi__f2f(0.5411961f));
2367  __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2368  __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2369  __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f(0.298631336f), stbi__f2f(-1.961570560f));
2370  __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f(3.072711026f));
2371  __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f(2.053119869f), stbi__f2f(-0.390180644f));
2372  __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f(1.501321110f));
2373 
2374  // rounding biases in column/row passes, see stbi__idct_block for explanation.
2375  __m128i bias_0 = _mm_set1_epi32(512);
2376  __m128i bias_1 = _mm_set1_epi32(65536 + (128 << 17));
2377 
2378  // load
2379  row0 = _mm_load_si128((const __m128i*) (data + 0 * 8));
2380  row1 = _mm_load_si128((const __m128i*) (data + 1 * 8));
2381  row2 = _mm_load_si128((const __m128i*) (data + 2 * 8));
2382  row3 = _mm_load_si128((const __m128i*) (data + 3 * 8));
2383  row4 = _mm_load_si128((const __m128i*) (data + 4 * 8));
2384  row5 = _mm_load_si128((const __m128i*) (data + 5 * 8));
2385  row6 = _mm_load_si128((const __m128i*) (data + 6 * 8));
2386  row7 = _mm_load_si128((const __m128i*) (data + 7 * 8));
2387 
2388  // column pass
2389  dct_pass(bias_0, 10);
2390 
2391  {
2392  // 16bit 8x8 transpose pass 1
2393  dct_interleave16(row0, row4);
2394  dct_interleave16(row1, row5);
2395  dct_interleave16(row2, row6);
2396  dct_interleave16(row3, row7);
2397 
2398  // transpose pass 2
2399  dct_interleave16(row0, row2);
2400  dct_interleave16(row1, row3);
2401  dct_interleave16(row4, row6);
2402  dct_interleave16(row5, row7);
2403 
2404  // transpose pass 3
2405  dct_interleave16(row0, row1);
2406  dct_interleave16(row2, row3);
2407  dct_interleave16(row4, row5);
2408  dct_interleave16(row6, row7);
2409  }
2410 
2411  // row pass
2412  dct_pass(bias_1, 17);
2413 
2414  {
2415  // pack
2416  __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2417  __m128i p1 = _mm_packus_epi16(row2, row3);
2418  __m128i p2 = _mm_packus_epi16(row4, row5);
2419  __m128i p3 = _mm_packus_epi16(row6, row7);
2420 
2421  // 8bit 8x8 transpose pass 1
2422  dct_interleave8(p0, p2); // a0e0a1e1...
2423  dct_interleave8(p1, p3); // c0g0c1g1...
2424 
2425  // transpose pass 2
2426  dct_interleave8(p0, p1); // a0c0e0g0...
2427  dct_interleave8(p2, p3); // b0d0f0h0...
2428 
2429  // transpose pass 3
2430  dct_interleave8(p0, p2); // a0b0c0d0...
2431  dct_interleave8(p1, p3); // a4b4c4d4...
2432 
2433  // store
2434  _mm_storel_epi64((__m128i*) out, p0); out += out_stride;
2435  _mm_storel_epi64((__m128i*) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2436  _mm_storel_epi64((__m128i*) out, p2); out += out_stride;
2437  _mm_storel_epi64((__m128i*) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2438  _mm_storel_epi64((__m128i*) out, p1); out += out_stride;
2439  _mm_storel_epi64((__m128i*) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2440  _mm_storel_epi64((__m128i*) out, p3); out += out_stride;
2441  _mm_storel_epi64((__m128i*) out, _mm_shuffle_epi32(p3, 0x4e));
2442  }
2443 
2444 #undef dct_const
2445 #undef dct_rot
2446 #undef dct_widen
2447 #undef dct_wadd
2448 #undef dct_wsub
2449 #undef dct_bfly32o
2450 #undef dct_interleave8
2451 #undef dct_interleave16
2452 #undef dct_pass
2453 }
2454 
2455 #endif // STBI_SSE2
2456 
2457 #ifdef STBI_NEON
2458 
2459 // NEON integer IDCT. should produce bit-identical
2460 // results to the generic C version.
2461 static void stbi__idct_simd(stbi_uc* out, int out_stride, short data[64])
2462 {
2463  int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2464 
2465  int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2466  int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2467  int16x4_t rot0_2 = vdup_n_s16(stbi__f2f(0.765366865f));
2468  int16x4_t rot1_0 = vdup_n_s16(stbi__f2f(1.175875602f));
2469  int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2470  int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2471  int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2472  int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2473  int16x4_t rot3_0 = vdup_n_s16(stbi__f2f(0.298631336f));
2474  int16x4_t rot3_1 = vdup_n_s16(stbi__f2f(2.053119869f));
2475  int16x4_t rot3_2 = vdup_n_s16(stbi__f2f(3.072711026f));
2476  int16x4_t rot3_3 = vdup_n_s16(stbi__f2f(1.501321110f));
2477 
2478 #define dct_long_mul(out, inq, coeff) \
2479  int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2480  int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2481 
2482 #define dct_long_mac(out, acc, inq, coeff) \
2483  int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2484  int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2485 
2486 #define dct_widen(out, inq) \
2487  int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2488  int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2489 
2490  // wide add
2491 #define dct_wadd(out, a, b) \
2492  int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2493  int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2494 
2495  // wide sub
2496 #define dct_wsub(out, a, b) \
2497  int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2498  int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2499 
2500  // butterfly a/b, then shift using "shiftop" by "s" and pack
2501 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2502  { \
2503  dct_wadd(sum, a, b); \
2504  dct_wsub(dif, a, b); \
2505  out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2506  out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2507  }
2508 
2509 #define dct_pass(shiftop, shift) \
2510  { \
2511  /* even part */ \
2512  int16x8_t sum26 = vaddq_s16(row2, row6); \
2513  dct_long_mul(p1e, sum26, rot0_0); \
2514  dct_long_mac(t2e, p1e, row6, rot0_1); \
2515  dct_long_mac(t3e, p1e, row2, rot0_2); \
2516  int16x8_t sum04 = vaddq_s16(row0, row4); \
2517  int16x8_t dif04 = vsubq_s16(row0, row4); \
2518  dct_widen(t0e, sum04); \
2519  dct_widen(t1e, dif04); \
2520  dct_wadd(x0, t0e, t3e); \
2521  dct_wsub(x3, t0e, t3e); \
2522  dct_wadd(x1, t1e, t2e); \
2523  dct_wsub(x2, t1e, t2e); \
2524  /* odd part */ \
2525  int16x8_t sum15 = vaddq_s16(row1, row5); \
2526  int16x8_t sum17 = vaddq_s16(row1, row7); \
2527  int16x8_t sum35 = vaddq_s16(row3, row5); \
2528  int16x8_t sum37 = vaddq_s16(row3, row7); \
2529  int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2530  dct_long_mul(p5o, sumodd, rot1_0); \
2531  dct_long_mac(p1o, p5o, sum17, rot1_1); \
2532  dct_long_mac(p2o, p5o, sum35, rot1_2); \
2533  dct_long_mul(p3o, sum37, rot2_0); \
2534  dct_long_mul(p4o, sum15, rot2_1); \
2535  dct_wadd(sump13o, p1o, p3o); \
2536  dct_wadd(sump24o, p2o, p4o); \
2537  dct_wadd(sump23o, p2o, p3o); \
2538  dct_wadd(sump14o, p1o, p4o); \
2539  dct_long_mac(x4, sump13o, row7, rot3_0); \
2540  dct_long_mac(x5, sump24o, row5, rot3_1); \
2541  dct_long_mac(x6, sump23o, row3, rot3_2); \
2542  dct_long_mac(x7, sump14o, row1, rot3_3); \
2543  dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2544  dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2545  dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2546  dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2547  }
2548 
2549  // load
2550  row0 = vld1q_s16(data + 0 * 8);
2551  row1 = vld1q_s16(data + 1 * 8);
2552  row2 = vld1q_s16(data + 2 * 8);
2553  row3 = vld1q_s16(data + 3 * 8);
2554  row4 = vld1q_s16(data + 4 * 8);
2555  row5 = vld1q_s16(data + 5 * 8);
2556  row6 = vld1q_s16(data + 6 * 8);
2557  row7 = vld1q_s16(data + 7 * 8);
2558 
2559  // add DC bias
2560  row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2561 
2562  // column pass
2563  dct_pass(vrshrn_n_s32, 10);
2564 
2565  // 16bit 8x8 transpose
2566  {
2567  // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2568  // whether compilers actually get this is another story, sadly.
2569 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2570 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2571 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2572 
2573  // pass 1
2574  dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2575  dct_trn16(row2, row3);
2576  dct_trn16(row4, row5);
2577  dct_trn16(row6, row7);
2578 
2579  // pass 2
2580  dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2581  dct_trn32(row1, row3);
2582  dct_trn32(row4, row6);
2583  dct_trn32(row5, row7);
2584 
2585  // pass 3
2586  dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2587  dct_trn64(row1, row5);
2588  dct_trn64(row2, row6);
2589  dct_trn64(row3, row7);
2590 
2591 #undef dct_trn16
2592 #undef dct_trn32
2593 #undef dct_trn64
2594  }
2595 
2596  // row pass
2597  // vrshrn_n_s32 only supports shifts up to 16, we need
2598  // 17. so do a non-rounding shift of 16 first then follow
2599  // up with a rounding shift by 1.
2600  dct_pass(vshrn_n_s32, 16);
2601 
2602  {
2603  // pack and round
2604  uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2605  uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2606  uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2607  uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2608  uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2609  uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2610  uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2611  uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2612 
2613  // again, these can translate into one instruction, but often don't.
2614 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2615 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2616 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2617 
2618  // sadly can't use interleaved stores here since we only write
2619  // 8 bytes to each scan line!
2620 
2621  // 8x8 8-bit transpose pass 1
2622  dct_trn8_8(p0, p1);
2623  dct_trn8_8(p2, p3);
2624  dct_trn8_8(p4, p5);
2625  dct_trn8_8(p6, p7);
2626 
2627  // pass 2
2628  dct_trn8_16(p0, p2);
2629  dct_trn8_16(p1, p3);
2630  dct_trn8_16(p4, p6);
2631  dct_trn8_16(p5, p7);
2632 
2633  // pass 3
2634  dct_trn8_32(p0, p4);
2635  dct_trn8_32(p1, p5);
2636  dct_trn8_32(p2, p6);
2637  dct_trn8_32(p3, p7);
2638 
2639  // store
2640  vst1_u8(out, p0); out += out_stride;
2641  vst1_u8(out, p1); out += out_stride;
2642  vst1_u8(out, p2); out += out_stride;
2643  vst1_u8(out, p3); out += out_stride;
2644  vst1_u8(out, p4); out += out_stride;
2645  vst1_u8(out, p5); out += out_stride;
2646  vst1_u8(out, p6); out += out_stride;
2647  vst1_u8(out, p7);
2648 
2649 #undef dct_trn8_8
2650 #undef dct_trn8_16
2651 #undef dct_trn8_32
2652  }
2653 
2654 #undef dct_long_mul
2655 #undef dct_long_mac
2656 #undef dct_widen
2657 #undef dct_wadd
2658 #undef dct_wsub
2659 #undef dct_bfly32o
2660 #undef dct_pass
2661 }
2662 
2663 #endif // STBI_NEON
2664 
2665 #define STBI__MARKER_none 0xff
2666 // if there's a pending marker from the entropy stream, return that
2667 // otherwise, fetch from the stream and get a marker. if there's no
2668 // marker, return 0xff, which is never a valid marker value
2669 static stbi_uc stbi__get_marker(stbi__jpeg * j)
2670 {
2671  stbi_uc x;
2672  if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2673  x = stbi__get8(j->s);
2674  if (x != 0xff) return STBI__MARKER_none;
2675  while (x == 0xff)
2676  x = stbi__get8(j->s); // consume repeated 0xff fill bytes
2677  return x;
2678 }
2679 
2680 // in each scan, we'll have scan_n components, and the order
2681 // of the components is specified by order[]
2682 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2683 
2684 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2685 // the dc prediction
2686 static void stbi__jpeg_reset(stbi__jpeg* j)
2687 {
2688  j->code_bits = 0;
2689  j->code_buffer = 0;
2690  j->nomore = 0;
2691  j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
2692  j->marker = STBI__MARKER_none;
2693  j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2694  j->eob_run = 0;
2695  // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2696  // since we don't even allow 1<<30 pixels
2697 }
2698 
2699 static int stbi__parse_entropy_coded_data(stbi__jpeg* z)
2700 {
2701  stbi__jpeg_reset(z);
2702  if (!z->progressive) {
2703  if (z->scan_n == 1) {
2704  int i, j;
2705  STBI_SIMD_ALIGN(short, data[64]);
2706  int n = z->order[0];
2707  // non-interleaved data, we just need to process one block at a time,
2708  // in trivial scanline order
2709  // number of blocks to do just depends on how many actual "pixels" this
2710  // component has, independent of interleaved MCU blocking and such
2711  int w = (z->img_comp[n].x + 7) >> 3;
2712  int h = (z->img_comp[n].y + 7) >> 3;
2713  for (j = 0; j < h; ++j) {
2714  for (i = 0; i < w; ++i) {
2715  int ha = z->img_comp[n].ha;
2716  if (!stbi__jpeg_decode_block(z, data, z->huff_dc + z->img_comp[n].hd, z->huff_ac + ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2717  z->idct_block_kernel(z->img_comp[n].data + z->img_comp[n].w2 * j * 8 + i * 8, z->img_comp[n].w2, data);
2718  // every data block is an MCU, so countdown the restart interval
2719  if (--z->todo <= 0) {
2720  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2721  // if it's NOT a restart, then just bail, so we get corrupt data
2722  // rather than no data
2723  if (!STBI__RESTART(z->marker)) return 1;
2724  stbi__jpeg_reset(z);
2725  }
2726  }
2727  }
2728  return 1;
2729  }
2730  else { // interleaved
2731  int i, j, k, x, y;
2732  STBI_SIMD_ALIGN(short, data[64]);
2733  for (j = 0; j < z->img_mcu_y; ++j) {
2734  for (i = 0; i < z->img_mcu_x; ++i) {
2735  // scan an interleaved mcu... process scan_n components in order
2736  for (k = 0; k < z->scan_n; ++k) {
2737  int n = z->order[k];
2738  // scan out an mcu's worth of this component; that's just determined
2739  // by the basic H and V specified for the component
2740  for (y = 0; y < z->img_comp[n].v; ++y) {
2741  for (x = 0; x < z->img_comp[n].h; ++x) {
2742  int x2 = (i * z->img_comp[n].h + x) * 8;
2743  int y2 = (j * z->img_comp[n].v + y) * 8;
2744  int ha = z->img_comp[n].ha;
2745  if (!stbi__jpeg_decode_block(z, data, z->huff_dc + z->img_comp[n].hd, z->huff_ac + ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2746  z->idct_block_kernel(z->img_comp[n].data + z->img_comp[n].w2 * y2 + x2, z->img_comp[n].w2, data);
2747  }
2748  }
2749  }
2750  // after all interleaved components, that's an interleaved MCU,
2751  // so now count down the restart interval
2752  if (--z->todo <= 0) {
2753  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2754  if (!STBI__RESTART(z->marker)) return 1;
2755  stbi__jpeg_reset(z);
2756  }
2757  }
2758  }
2759  return 1;
2760  }
2761  }
2762  else {
2763  if (z->scan_n == 1) {
2764  int i, j;
2765  int n = z->order[0];
2766  // non-interleaved data, we just need to process one block at a time,
2767  // in trivial scanline order
2768  // number of blocks to do just depends on how many actual "pixels" this
2769  // component has, independent of interleaved MCU blocking and such
2770  int w = (z->img_comp[n].x + 7) >> 3;
2771  int h = (z->img_comp[n].y + 7) >> 3;
2772  for (j = 0; j < h; ++j) {
2773  for (i = 0; i < w; ++i) {
2774  short* data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2775  if (z->spec_start == 0) {
2776  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2777  return 0;
2778  }
2779  else {
2780  int ha = z->img_comp[n].ha;
2781  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2782  return 0;
2783  }
2784  // every data block is an MCU, so countdown the restart interval
2785  if (--z->todo <= 0) {
2786  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2787  if (!STBI__RESTART(z->marker)) return 1;
2788  stbi__jpeg_reset(z);
2789  }
2790  }
2791  }
2792  return 1;
2793  }
2794  else { // interleaved
2795  int i, j, k, x, y;
2796  for (j = 0; j < z->img_mcu_y; ++j) {
2797  for (i = 0; i < z->img_mcu_x; ++i) {
2798  // scan an interleaved mcu... process scan_n components in order
2799  for (k = 0; k < z->scan_n; ++k) {
2800  int n = z->order[k];
2801  // scan out an mcu's worth of this component; that's just determined
2802  // by the basic H and V specified for the component
2803  for (y = 0; y < z->img_comp[n].v; ++y) {
2804  for (x = 0; x < z->img_comp[n].h; ++x) {
2805  int x2 = (i * z->img_comp[n].h + x);
2806  int y2 = (j * z->img_comp[n].v + y);
2807  short* data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2808  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2809  return 0;
2810  }
2811  }
2812  }
2813  // after all interleaved components, that's an interleaved MCU,
2814  // so now count down the restart interval
2815  if (--z->todo <= 0) {
2816  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2817  if (!STBI__RESTART(z->marker)) return 1;
2818  stbi__jpeg_reset(z);
2819  }
2820  }
2821  }
2822  return 1;
2823  }
2824  }
2825 }
2826 
2827 static void stbi__jpeg_dequantize(short* data, stbi__uint16* dequant)
2828 {
2829  int i;
2830  for (i = 0; i < 64; ++i)
2831  data[i] *= dequant[i];
2832 }
2833 
2834 static void stbi__jpeg_finish(stbi__jpeg* z)
2835 {
2836  if (z->progressive) {
2837  // dequantize and idct the data
2838  int i, j, n;
2839  for (n = 0; n < z->s->img_n; ++n) {
2840  int w = (z->img_comp[n].x + 7) >> 3;
2841  int h = (z->img_comp[n].y + 7) >> 3;
2842  for (j = 0; j < h; ++j) {
2843  for (i = 0; i < w; ++i) {
2844  short* data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2845  stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2846  z->idct_block_kernel(z->img_comp[n].data + z->img_comp[n].w2 * j * 8 + i * 8, z->img_comp[n].w2, data);
2847  }
2848  }
2849  }
2850  }
2851 }
2852 
2853 static int stbi__process_marker(stbi__jpeg* z, int m)
2854 {
2855  int L;
2856  switch (m) {
2857  case STBI__MARKER_none: // no marker found
2858  return stbi__err("expected marker", "Corrupt JPEG");
2859 
2860  case 0xDD: // DRI - specify restart interval
2861  if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len", "Corrupt JPEG");
2862  z->restart_interval = stbi__get16be(z->s);
2863  return 1;
2864 
2865  case 0xDB: // DQT - define quantization table
2866  L = stbi__get16be(z->s) - 2;
2867  while (L > 0) {
2868  int q = stbi__get8(z->s);
2869  int p = q >> 4, sixteen = (p != 0);
2870  int t = q & 15, i;
2871  if (p != 0 && p != 1) return stbi__err("bad DQT type", "Corrupt JPEG");
2872  if (t > 3) return stbi__err("bad DQT table", "Corrupt JPEG");
2873 
2874  for (i = 0; i < 64; ++i)
2875  z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
2876  L -= (sixteen ? 129 : 65);
2877  }
2878  return L == 0;
2879 
2880  case 0xC4: // DHT - define huffman table
2881  L = stbi__get16be(z->s) - 2;
2882  while (L > 0) {
2883  stbi_uc* v;
2884  int sizes[16], i, n = 0;
2885  int q = stbi__get8(z->s);
2886  int tc = q >> 4;
2887  int th = q & 15;
2888  if (tc > 1 || th > 3) return stbi__err("bad DHT header", "Corrupt JPEG");
2889  for (i = 0; i < 16; ++i) {
2890  sizes[i] = stbi__get8(z->s);
2891  n += sizes[i];
2892  }
2893  L -= 17;
2894  if (tc == 0) {
2895  if (!stbi__build_huffman(z->huff_dc + th, sizes)) return 0;
2896  v = z->huff_dc[th].values;
2897  }
2898  else {
2899  if (!stbi__build_huffman(z->huff_ac + th, sizes)) return 0;
2900  v = z->huff_ac[th].values;
2901  }
2902  for (i = 0; i < n; ++i)
2903  v[i] = stbi__get8(z->s);
2904  if (tc != 0)
2905  stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2906  L -= n;
2907  }
2908  return L == 0;
2909  }
2910 
2911  // check for comment block or APP blocks
2912  if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2913  L = stbi__get16be(z->s);
2914  if (L < 2) {
2915  if (m == 0xFE)
2916  return stbi__err("bad COM len", "Corrupt JPEG");
2917  else
2918  return stbi__err("bad APP len", "Corrupt JPEG");
2919  }
2920  L -= 2;
2921 
2922  if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
2923  static const unsigned char tag[5] = { 'J','F','I','F','\0' };
2924  int ok = 1;
2925  int i;
2926  for (i = 0; i < 5; ++i)
2927  if (stbi__get8(z->s) != tag[i])
2928  ok = 0;
2929  L -= 5;
2930  if (ok)
2931  z->jfif = 1;
2932  }
2933  else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
2934  static const unsigned char tag[6] = { 'A','d','o','b','e','\0' };
2935  int ok = 1;
2936  int i;
2937  for (i = 0; i < 6; ++i)
2938  if (stbi__get8(z->s) != tag[i])
2939  ok = 0;
2940  L -= 6;
2941  if (ok) {
2942  stbi__get8(z->s); // version
2943  stbi__get16be(z->s); // flags0
2944  stbi__get16be(z->s); // flags1
2945  z->app14_color_transform = stbi__get8(z->s); // color transform
2946  L -= 6;
2947  }
2948  }
2949 
2950  stbi__skip(z->s, L);
2951  return 1;
2952  }
2953 
2954  return stbi__err("unknown marker", "Corrupt JPEG");
2955 }
2956 
2957 // after we see SOS
2958 static int stbi__process_scan_header(stbi__jpeg* z)
2959 {
2960  int i;
2961  int Ls = stbi__get16be(z->s);
2962  z->scan_n = stbi__get8(z->s);
2963  if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int)z->s->img_n) return stbi__err("bad SOS component count", "Corrupt JPEG");
2964  if (Ls != 6 + 2 * z->scan_n) return stbi__err("bad SOS len", "Corrupt JPEG");
2965  for (i = 0; i < z->scan_n; ++i) {
2966  int id = stbi__get8(z->s), which;
2967  int q = stbi__get8(z->s);
2968  for (which = 0; which < z->s->img_n; ++which)
2969  if (z->img_comp[which].id == id)
2970  break;
2971  if (which == z->s->img_n) return 0; // no match
2972  z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff", "Corrupt JPEG");
2973  z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff", "Corrupt JPEG");
2974  z->order[i] = which;
2975  }
2976 
2977  {
2978  int aa;
2979  z->spec_start = stbi__get8(z->s);
2980  z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2981  aa = stbi__get8(z->s);
2982  z->succ_high = (aa >> 4);
2983  z->succ_low = (aa & 15);
2984  if (z->progressive) {
2985  if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2986  return stbi__err("bad SOS", "Corrupt JPEG");
2987  }
2988  else {
2989  if (z->spec_start != 0) return stbi__err("bad SOS", "Corrupt JPEG");
2990  if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS", "Corrupt JPEG");
2991  z->spec_end = 63;
2992  }
2993  }
2994 
2995  return 1;
2996 }
2997 
2998 static int stbi__free_jpeg_components(stbi__jpeg* z, int ncomp, int why)
2999 {
3000  int i;
3001  for (i = 0; i < ncomp; ++i) {
3002  if (z->img_comp[i].raw_data) {
3003  STBI_FREE(z->img_comp[i].raw_data);
3004  z->img_comp[i].raw_data = NULL;
3005  z->img_comp[i].data = NULL;
3006  }
3007  if (z->img_comp[i].raw_coeff) {
3008  STBI_FREE(z->img_comp[i].raw_coeff);
3009  z->img_comp[i].raw_coeff = 0;
3010  z->img_comp[i].coeff = 0;
3011  }
3012  if (z->img_comp[i].linebuf) {
3013  STBI_FREE(z->img_comp[i].linebuf);
3014  z->img_comp[i].linebuf = NULL;
3015  }
3016  }
3017  return why;
3018 }
3019 
3020 static int stbi__process_frame_header(stbi__jpeg* z, int scan)
3021 {
3022  stbi__context* s = z->s;
3023  int Lf, p, i, q, h_max = 1, v_max = 1, c;
3024  Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len", "Corrupt JPEG"); // JPEG
3025  p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit", "JPEG format not supported: 8-bit only"); // JPEG baseline
3026  s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
3027  s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width", "Corrupt JPEG"); // JPEG requires
3028  c = stbi__get8(s);
3029  if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count", "Corrupt JPEG");
3030  s->img_n = c;
3031  for (i = 0; i < c; ++i) {
3032  z->img_comp[i].data = NULL;
3033  z->img_comp[i].linebuf = NULL;
3034  }
3035 
3036  if (Lf != 8 + 3 * s->img_n) return stbi__err("bad SOF len", "Corrupt JPEG");
3037 
3038  z->rgb = 0;
3039  for (i = 0; i < s->img_n; ++i) {
3040  static const unsigned char rgb[3] = { 'R', 'G', 'B' };
3041  z->img_comp[i].id = stbi__get8(s);
3042  if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
3043  ++z->rgb;
3044  q = stbi__get8(s);
3045  z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H", "Corrupt JPEG");
3046  z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V", "Corrupt JPEG");
3047  z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ", "Corrupt JPEG");
3048  }
3049 
3050  if (scan != STBI__SCAN_load) return 1;
3051 
3052  if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
3053 
3054  for (i = 0; i < s->img_n; ++i) {
3055  if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
3056  if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
3057  }
3058 
3059  // compute interleaved mcu info
3060  z->img_h_max = h_max;
3061  z->img_v_max = v_max;
3062  z->img_mcu_w = h_max * 8;
3063  z->img_mcu_h = v_max * 8;
3064  // these sizes can't be more than 17 bits
3065  z->img_mcu_x = (s->img_x + z->img_mcu_w - 1) / z->img_mcu_w;
3066  z->img_mcu_y = (s->img_y + z->img_mcu_h - 1) / z->img_mcu_h;
3067 
3068  for (i = 0; i < s->img_n; ++i) {
3069  // number of effective pixels (e.g. for non-interleaved MCU)
3070  z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max - 1) / h_max;
3071  z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max - 1) / v_max;
3072  // to simplify generation, we'll allocate enough memory to decode
3073  // the bogus oversized data from using interleaved MCUs and their
3074  // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
3075  // discard the extra data until colorspace conversion
3076  //
3077  // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
3078  // so these muls can't overflow with 32-bit ints (which we require)
3079  z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
3080  z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
3081  z->img_comp[i].coeff = 0;
3082  z->img_comp[i].raw_coeff = 0;
3083  z->img_comp[i].linebuf = NULL;
3084  z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
3085  if (z->img_comp[i].raw_data == NULL)
3086  return stbi__free_jpeg_components(z, i + 1, stbi__err("outofmem", "Out of memory"));
3087  // align blocks for idct using mmx/sse
3088  z->img_comp[i].data = (stbi_uc*)(((size_t)z->img_comp[i].raw_data + 15) & ~15);
3089  if (z->progressive) {
3090  // w2, h2 are multiples of 8 (see above)
3091  z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
3092  z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
3093  z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
3094  if (z->img_comp[i].raw_coeff == NULL)
3095  return stbi__free_jpeg_components(z, i + 1, stbi__err("outofmem", "Out of memory"));
3096  z->img_comp[i].coeff = (short*)(((size_t)z->img_comp[i].raw_coeff + 15) & ~15);
3097  }
3098  }
3099 
3100  return 1;
3101 }
3102 
3103 // use comparisons since in some cases we handle more than one case (e.g. SOF)
3104 #define stbi__DNL(x) ((x) == 0xdc)
3105 #define stbi__SOI(x) ((x) == 0xd8)
3106 #define stbi__EOI(x) ((x) == 0xd9)
3107 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
3108 #define stbi__SOS(x) ((x) == 0xda)
3109 
3110 #define stbi__SOF_progressive(x) ((x) == 0xc2)
3111 
3112 static int stbi__decode_jpeg_header(stbi__jpeg* z, int scan)
3113 {
3114  int m;
3115  z->jfif = 0;
3116  z->app14_color_transform = -1; // valid values are 0,1,2
3117  z->marker = STBI__MARKER_none; // initialize cached marker to empty
3118  m = stbi__get_marker(z);
3119  if (!stbi__SOI(m)) return stbi__err("no SOI", "Corrupt JPEG");
3120  if (scan == STBI__SCAN_type) return 1;
3121  m = stbi__get_marker(z);
3122  while (!stbi__SOF(m)) {
3123  if (!stbi__process_marker(z, m)) return 0;
3124  m = stbi__get_marker(z);
3125  while (m == STBI__MARKER_none) {
3126  // some files have extra padding after their blocks, so ok, we'll scan
3127  if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
3128  m = stbi__get_marker(z);
3129  }
3130  }
3131  z->progressive = stbi__SOF_progressive(m);
3132  if (!stbi__process_frame_header(z, scan)) return 0;
3133  return 1;
3134 }
3135 
3136 // decode image to YCbCr format
3137 static int stbi__decode_jpeg_image(stbi__jpeg* j)
3138 {
3139  int m;
3140  for (m = 0; m < 4; m++) {
3141  j->img_comp[m].raw_data = NULL;
3142  j->img_comp[m].raw_coeff = NULL;
3143  }
3144  j->restart_interval = 0;
3145  if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
3146  m = stbi__get_marker(j);
3147  while (!stbi__EOI(m)) {
3148  if (stbi__SOS(m)) {
3149  if (!stbi__process_scan_header(j)) return 0;
3150  if (!stbi__parse_entropy_coded_data(j)) return 0;
3151  if (j->marker == STBI__MARKER_none) {
3152  // handle 0s at the end of image data from IP Kamera 9060
3153  while (!stbi__at_eof(j->s)) {
3154  int x = stbi__get8(j->s);
3155  if (x == 255) {
3156  j->marker = stbi__get8(j->s);
3157  break;
3158  }
3159  }
3160  // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
3161  }
3162  }
3163  else if (stbi__DNL(m)) {
3164  int Ld = stbi__get16be(j->s);
3165  stbi__uint32 NL = stbi__get16be(j->s);
3166  if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
3167  if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
3168  }
3169  else {
3170  if (!stbi__process_marker(j, m)) return 0;
3171  }
3172  m = stbi__get_marker(j);
3173  }
3174  if (j->progressive)
3175  stbi__jpeg_finish(j);
3176  return 1;
3177 }
3178 
3179 // static jfif-centered resampling (across block boundaries)
3180 
3181 typedef stbi_uc* (*resample_row_func)(stbi_uc* out, stbi_uc* in0, stbi_uc* in1,
3182  int w, int hs);
3183 
3184 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
3185 
3186 static stbi_uc* resample_row_1(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs)
3187 {
3188  STBI_NOTUSED(out);
3189  STBI_NOTUSED(in_far);
3190  STBI_NOTUSED(w);
3191  STBI_NOTUSED(hs);
3192  return in_near;
3193 }
3194 
3195 static stbi_uc* stbi__resample_row_v_2(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs)
3196 {
3197  // need to generate two samples vertically for every one in input
3198  int i;
3199  STBI_NOTUSED(hs);
3200  for (i = 0; i < w; ++i)
3201  out[i] = stbi__div4(3 * in_near[i] + in_far[i] + 2);
3202  return out;
3203 }
3204 
3205 static stbi_uc* stbi__resample_row_h_2(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs)
3206 {
3207  // need to generate two samples horizontally for every one in input
3208  int i;
3209  stbi_uc* input = in_near;
3210 
3211  if (w == 1) {
3212  // if only one sample, can't do any interpolation
3213  out[0] = out[1] = input[0];
3214  return out;
3215  }
3216 
3217  out[0] = input[0];
3218  out[1] = stbi__div4(input[0] * 3 + input[1] + 2);
3219  for (i = 1; i < w - 1; ++i) {
3220  int n = 3 * input[i] + 2;
3221  out[i * 2 + 0] = stbi__div4(n + input[i - 1]);
3222  out[i * 2 + 1] = stbi__div4(n + input[i + 1]);
3223  }
3224  out[i * 2 + 0] = stbi__div4(input[w - 2] * 3 + input[w - 1] + 2);
3225  out[i * 2 + 1] = input[w - 1];
3226 
3227  STBI_NOTUSED(in_far);
3228  STBI_NOTUSED(hs);
3229 
3230  return out;
3231 }
3232 
3233 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
3234 
3235 static stbi_uc* stbi__resample_row_hv_2(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs)
3236 {
3237  // need to generate 2x2 samples for every one in input
3238  int i, t0, t1;
3239  if (w == 1) {
3240  out[0] = out[1] = stbi__div4(3 * in_near[0] + in_far[0] + 2);
3241  return out;
3242  }
3243 
3244  t1 = 3 * in_near[0] + in_far[0];
3245  out[0] = stbi__div4(t1 + 2);
3246  for (i = 1; i < w; ++i) {
3247  t0 = t1;
3248  t1 = 3 * in_near[i] + in_far[i];
3249  out[i * 2 - 1] = stbi__div16(3 * t0 + t1 + 8);
3250  out[i * 2] = stbi__div16(3 * t1 + t0 + 8);
3251  }
3252  out[w * 2 - 1] = stbi__div4(t1 + 2);
3253 
3254  STBI_NOTUSED(hs);
3255 
3256  return out;
3257 }
3258 
3259 #if defined(STBI_SSE2) || defined(STBI_NEON)
3260 static stbi_uc* stbi__resample_row_hv_2_simd(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs)
3261 {
3262  // need to generate 2x2 samples for every one in input
3263  int i = 0, t0, t1;
3264 
3265  if (w == 1) {
3266  out[0] = out[1] = stbi__div4(3 * in_near[0] + in_far[0] + 2);
3267  return out;
3268  }
3269 
3270  t1 = 3 * in_near[0] + in_far[0];
3271  // process groups of 8 pixels for as long as we can.
3272  // note we can't handle the last pixel in a row in this loop
3273  // because we need to handle the filter boundary conditions.
3274  for (; i < ((w - 1) & ~7); i += 8) {
3275 #if defined(STBI_SSE2)
3276  // load and perform the vertical filtering pass
3277  // this uses 3*x + y = 4*x + (y - x)
3278  __m128i zero = _mm_setzero_si128();
3279  __m128i farb = _mm_loadl_epi64((__m128i*) (in_far + i));
3280  __m128i nearb = _mm_loadl_epi64((__m128i*) (in_near + i));
3281  __m128i farw = _mm_unpacklo_epi8(farb, zero);
3282  __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
3283  __m128i diff = _mm_sub_epi16(farw, nearw);
3284  __m128i nears = _mm_slli_epi16(nearw, 2);
3285  __m128i curr = _mm_add_epi16(nears, diff); // current row
3286 
3287  // horizontal filter works the same based on shifted vers of current
3288  // row. "prev" is current row shifted right by 1 pixel; we need to
3289  // insert the previous pixel value (from t1).
3290  // "next" is current row shifted left by 1 pixel, with first pixel
3291  // of next block of 8 pixels added in.
3292  __m128i prv0 = _mm_slli_si128(curr, 2);
3293  __m128i nxt0 = _mm_srli_si128(curr, 2);
3294  __m128i prev = _mm_insert_epi16(prv0, t1, 0);
3295  __m128i next = _mm_insert_epi16(nxt0, 3 * in_near[i + 8] + in_far[i + 8], 7);
3296 
3297  // horizontal filter, polyphase implementation since it's convenient:
3298  // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3299  // odd pixels = 3*cur + next = cur*4 + (next - cur)
3300  // note the shared term.
3301  __m128i bias = _mm_set1_epi16(8);
3302  __m128i curs = _mm_slli_epi16(curr, 2);
3303  __m128i prvd = _mm_sub_epi16(prev, curr);
3304  __m128i nxtd = _mm_sub_epi16(next, curr);
3305  __m128i curb = _mm_add_epi16(curs, bias);
3306  __m128i even = _mm_add_epi16(prvd, curb);
3307  __m128i odd = _mm_add_epi16(nxtd, curb);
3308 
3309  // interleave even and odd pixels, then undo scaling.
3310  __m128i int0 = _mm_unpacklo_epi16(even, odd);
3311  __m128i int1 = _mm_unpackhi_epi16(even, odd);
3312  __m128i de0 = _mm_srli_epi16(int0, 4);
3313  __m128i de1 = _mm_srli_epi16(int1, 4);
3314 
3315  // pack and write output
3316  __m128i outv = _mm_packus_epi16(de0, de1);
3317  _mm_storeu_si128((__m128i*) (out + i * 2), outv);
3318 #elif defined(STBI_NEON)
3319  // load and perform the vertical filtering pass
3320  // this uses 3*x + y = 4*x + (y - x)
3321  uint8x8_t farb = vld1_u8(in_far + i);
3322  uint8x8_t nearb = vld1_u8(in_near + i);
3323  int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3324  int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3325  int16x8_t curr = vaddq_s16(nears, diff); // current row
3326 
3327  // horizontal filter works the same based on shifted vers of current
3328  // row. "prev" is current row shifted right by 1 pixel; we need to
3329  // insert the previous pixel value (from t1).
3330  // "next" is current row shifted left by 1 pixel, with first pixel
3331  // of next block of 8 pixels added in.
3332  int16x8_t prv0 = vextq_s16(curr, curr, 7);
3333  int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3334  int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3335  int16x8_t next = vsetq_lane_s16(3 * in_near[i + 8] + in_far[i + 8], nxt0, 7);
3336 
3337  // horizontal filter, polyphase implementation since it's convenient:
3338  // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3339  // odd pixels = 3*cur + next = cur*4 + (next - cur)
3340  // note the shared term.
3341  int16x8_t curs = vshlq_n_s16(curr, 2);
3342  int16x8_t prvd = vsubq_s16(prev, curr);
3343  int16x8_t nxtd = vsubq_s16(next, curr);
3344  int16x8_t even = vaddq_s16(curs, prvd);
3345  int16x8_t odd = vaddq_s16(curs, nxtd);
3346 
3347  // undo scaling and round, then store with even/odd phases interleaved
3348  uint8x8x2_t o;
3349  o.val[0] = vqrshrun_n_s16(even, 4);
3350  o.val[1] = vqrshrun_n_s16(odd, 4);
3351  vst2_u8(out + i * 2, o);
3352 #endif
3353 
3354  // "previous" value for next iter
3355  t1 = 3 * in_near[i + 7] + in_far[i + 7];
3356  }
3357 
3358  t0 = t1;
3359  t1 = 3 * in_near[i] + in_far[i];
3360  out[i * 2] = stbi__div16(3 * t1 + t0 + 8);
3361 
3362  for (++i; i < w; ++i) {
3363  t0 = t1;
3364  t1 = 3 * in_near[i] + in_far[i];
3365  out[i * 2 - 1] = stbi__div16(3 * t0 + t1 + 8);
3366  out[i * 2] = stbi__div16(3 * t1 + t0 + 8);
3367  }
3368  out[w * 2 - 1] = stbi__div4(t1 + 2);
3369 
3370  STBI_NOTUSED(hs);
3371 
3372  return out;
3373 }
3374 #endif
3375 
3376 static stbi_uc* stbi__resample_row_generic(stbi_uc* out, stbi_uc* in_near, stbi_uc* in_far, int w, int hs)
3377 {
3378  // resample with nearest-neighbor
3379  int i, j;
3380  STBI_NOTUSED(in_far);
3381  for (i = 0; i < w; ++i)
3382  for (j = 0; j < hs; ++j)
3383  out[i * hs + j] = in_near[i];
3384  return out;
3385 }
3386 
3387 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3388 // to make sure the code produces the same results in both SIMD and scalar
3389 #define stbi__float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3390 static void stbi__YCbCr_to_RGB_row(stbi_uc* out, const stbi_uc* y, const stbi_uc* pcb, const stbi_uc* pcr, int count, int step)
3391 {
3392  int i;
3393  for (i = 0; i < count; ++i) {
3394  int y_fixed = (y[i] << 20) + (1 << 19); // rounding
3395  int r, g, b;
3396  int cr = pcr[i] - 128;
3397  int cb = pcb[i] - 128;
3398  r = y_fixed + cr * stbi__float2fixed(1.40200f);
3399  g = y_fixed + (cr * -stbi__float2fixed(0.71414f)) + ((cb * -stbi__float2fixed(0.34414f)) & 0xffff0000);
3400  b = y_fixed + cb * stbi__float2fixed(1.77200f);
3401  r >>= 20;
3402  g >>= 20;
3403  b >>= 20;
3404  if ((unsigned)r > 255) { if (r < 0) r = 0; else r = 255; }
3405  if ((unsigned)g > 255) { if (g < 0) g = 0; else g = 255; }
3406  if ((unsigned)b > 255) { if (b < 0) b = 0; else b = 255; }
3407  out[0] = (stbi_uc)r;
3408  out[1] = (stbi_uc)g;
3409  out[2] = (stbi_uc)b;
3410  out[3] = 255;
3411  out += step;
3412  }
3413 }
3414 
3415 #if defined(STBI_SSE2) || defined(STBI_NEON)
3416 static void stbi__YCbCr_to_RGB_simd(stbi_uc* out, stbi_uc const* y, stbi_uc const* pcb, stbi_uc const* pcr, int count, int step)
3417 {
3418  int i = 0;
3419 
3420 #ifdef STBI_SSE2
3421  // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3422  // it's useful in practice (you wouldn't use it for textures, for example).
3423  // so just accelerate step == 4 case.
3424  if (step == 4) {
3425  // this is a fairly straightforward implementation and not super-optimized.
3426  __m128i signflip = _mm_set1_epi8(-0x80);
3427  __m128i cr_const0 = _mm_set1_epi16((short)(1.40200f * 4096.0f + 0.5f));
3428  __m128i cr_const1 = _mm_set1_epi16(-(short)(0.71414f * 4096.0f + 0.5f));
3429  __m128i cb_const0 = _mm_set1_epi16(-(short)(0.34414f * 4096.0f + 0.5f));
3430  __m128i cb_const1 = _mm_set1_epi16((short)(1.77200f * 4096.0f + 0.5f));
3431  __m128i y_bias = _mm_set1_epi8((char)(unsigned char)128);
3432  __m128i xw = _mm_set1_epi16(255); // alpha channel
3433 
3434  for (; i + 7 < count; i += 8) {
3435  // load
3436  __m128i y_bytes = _mm_loadl_epi64((__m128i*) (y + i));
3437  __m128i cr_bytes = _mm_loadl_epi64((__m128i*) (pcr + i));
3438  __m128i cb_bytes = _mm_loadl_epi64((__m128i*) (pcb + i));
3439  __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3440  __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3441 
3442  // unpack to short (and left-shift cr, cb by 8)
3443  __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3444  __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3445  __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3446 
3447  // color transform
3448  __m128i yws = _mm_srli_epi16(yw, 4);
3449  __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3450  __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3451  __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3452  __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3453  __m128i rws = _mm_add_epi16(cr0, yws);
3454  __m128i gwt = _mm_add_epi16(cb0, yws);
3455  __m128i bws = _mm_add_epi16(yws, cb1);
3456  __m128i gws = _mm_add_epi16(gwt, cr1);
3457 
3458  // descale
3459  __m128i rw = _mm_srai_epi16(rws, 4);
3460  __m128i bw = _mm_srai_epi16(bws, 4);
3461  __m128i gw = _mm_srai_epi16(gws, 4);
3462 
3463  // back to byte, set up for transpose
3464  __m128i brb = _mm_packus_epi16(rw, bw);
3465  __m128i gxb = _mm_packus_epi16(gw, xw);
3466 
3467  // transpose to interleave channels
3468  __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3469  __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3470  __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3471  __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3472 
3473  // store
3474  _mm_storeu_si128((__m128i*) (out + 0), o0);
3475  _mm_storeu_si128((__m128i*) (out + 16), o1);
3476  out += 32;
3477  }
3478  }
3479 #endif
3480 
3481 #ifdef STBI_NEON
3482  // in this version, step=3 support would be easy to add. but is there demand?
3483  if (step == 4) {
3484  // this is a fairly straightforward implementation and not super-optimized.
3485  uint8x8_t signflip = vdup_n_u8(0x80);
3486  int16x8_t cr_const0 = vdupq_n_s16((short)(1.40200f * 4096.0f + 0.5f));
3487  int16x8_t cr_const1 = vdupq_n_s16(-(short)(0.71414f * 4096.0f + 0.5f));
3488  int16x8_t cb_const0 = vdupq_n_s16(-(short)(0.34414f * 4096.0f + 0.5f));
3489  int16x8_t cb_const1 = vdupq_n_s16((short)(1.77200f * 4096.0f + 0.5f));
3490 
3491  for (; i + 7 < count; i += 8) {
3492  // load
3493  uint8x8_t y_bytes = vld1_u8(y + i);
3494  uint8x8_t cr_bytes = vld1_u8(pcr + i);
3495  uint8x8_t cb_bytes = vld1_u8(pcb + i);
3496  int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3497  int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3498 
3499  // expand to s16
3500  int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3501  int16x8_t crw = vshll_n_s8(cr_biased, 7);
3502  int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3503 
3504  // color transform
3505  int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3506  int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3507  int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3508  int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3509  int16x8_t rws = vaddq_s16(yws, cr0);
3510  int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3511  int16x8_t bws = vaddq_s16(yws, cb1);
3512 
3513  // undo scaling, round, convert to byte
3514  uint8x8x4_t o;
3515  o.val[0] = vqrshrun_n_s16(rws, 4);
3516  o.val[1] = vqrshrun_n_s16(gws, 4);
3517  o.val[2] = vqrshrun_n_s16(bws, 4);
3518  o.val[3] = vdup_n_u8(255);
3519 
3520  // store, interleaving r/g/b/a
3521  vst4_u8(out, o);
3522  out += 8 * 4;
3523  }
3524  }
3525 #endif
3526 
3527  for (; i < count; ++i) {
3528  int y_fixed = (y[i] << 20) + (1 << 19); // rounding
3529  int r, g, b;
3530  int cr = pcr[i] - 128;
3531  int cb = pcb[i] - 128;
3532  r = y_fixed + cr * stbi__float2fixed(1.40200f);
3533  g = y_fixed + cr * -stbi__float2fixed(0.71414f) + ((cb * -stbi__float2fixed(0.34414f)) & 0xffff0000);
3534  b = y_fixed + cb * stbi__float2fixed(1.77200f);
3535  r >>= 20;
3536  g >>= 20;
3537  b >>= 20;
3538  if ((unsigned)r > 255) { if (r < 0) r = 0; else r = 255; }
3539  if ((unsigned)g > 255) { if (g < 0) g = 0; else g = 255; }
3540  if ((unsigned)b > 255) { if (b < 0) b = 0; else b = 255; }
3541  out[0] = (stbi_uc)r;
3542  out[1] = (stbi_uc)g;
3543  out[2] = (stbi_uc)b;
3544  out[3] = 255;
3545  out += step;
3546  }
3547 }
3548 #endif
3549 
3550 // set up the kernels
3551 static void stbi__setup_jpeg(stbi__jpeg* j)
3552 {
3553  j->idct_block_kernel = stbi__idct_block;
3554  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3555  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3556 
3557 #ifdef STBI_SSE2
3558  if (stbi__sse2_available()) {
3559  j->idct_block_kernel = stbi__idct_simd;
3560  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3561  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3562  }
3563 #endif
3564 
3565 #ifdef STBI_NEON
3566  j->idct_block_kernel = stbi__idct_simd;
3567  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3568  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3569 #endif
3570 }
3571 
3572 // clean up the temporary component buffers
3573 static void stbi__cleanup_jpeg(stbi__jpeg* j)
3574 {
3575  stbi__free_jpeg_components(j, j->s->img_n, 0);
3576 }
3577 
3578 typedef struct
3579 {
3580  resample_row_func resample;
3581  stbi_uc* line0, * line1;
3582  int hs, vs; // expansion factor in each axis
3583  int w_lores; // horizontal pixels pre-expansion
3584  int ystep; // how far through vertical expansion we are
3585  int ypos; // which pre-expansion row we're on
3586 } stbi__resample;
3587 
3588 // fast 0..255 * 0..255 => 0..255 rounded multiplication
3589 static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
3590 {
3591  unsigned int t = x * y + 128;
3592  return (stbi_uc)((t + (t >> 8)) >> 8);
3593 }
3594 
3595 static stbi_uc* load_jpeg_image(stbi__jpeg* z, int* out_x, int* out_y, int* comp, int req_comp)
3596 {
3597  int n, decode_n, is_rgb;
3598  z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3599 
3600  // validate req_comp
3601  if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3602 
3603  // load a jpeg image from whichever source, but leave in YCbCr format
3604  if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3605 
3606  // determine actual number of components to generate
3607  n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
3608 
3609  is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
3610 
3611  if (z->s->img_n == 3 && n < 3 && !is_rgb)
3612  decode_n = 1;
3613  else
3614  decode_n = z->s->img_n;
3615 
3616  // resample and color-convert
3617  {
3618  int k;
3619  unsigned int i, j;
3620  stbi_uc* output;
3621  stbi_uc* coutput[4];
3622 
3623  stbi__resample res_comp[4];
3624 
3625  for (k = 0; k < decode_n; ++k) {
3626  stbi__resample* r = &res_comp[k];
3627 
3628  // allocate line buffer big enough for upsampling off the edges
3629  // with upsample factor of 4
3630  z->img_comp[k].linebuf = (stbi_uc*)stbi__malloc(z->s->img_x + 3);
3631  if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3632 
3633  r->hs = z->img_h_max / z->img_comp[k].h;
3634  r->vs = z->img_v_max / z->img_comp[k].v;
3635  r->ystep = r->vs >> 1;
3636  r->w_lores = (z->s->img_x + r->hs - 1) / r->hs;
3637  r->ypos = 0;
3638  r->line0 = r->line1 = z->img_comp[k].data;
3639 
3640  if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3641  else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3642  else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3643  else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3644  else r->resample = stbi__resample_row_generic;
3645  }
3646 
3647  // can't error after this so, this is safe
3648  output = (stbi_uc*)stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
3649  if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3650 
3651  // now go ahead and resample
3652  for (j = 0; j < z->s->img_y; ++j) {
3653  stbi_uc* out = output + n * z->s->img_x * j;
3654  for (k = 0; k < decode_n; ++k) {
3655  stbi__resample* r = &res_comp[k];
3656  int y_bot = r->ystep >= (r->vs >> 1);
3657  coutput[k] = r->resample(z->img_comp[k].linebuf,
3658  y_bot ? r->line1 : r->line0,
3659  y_bot ? r->line0 : r->line1,
3660  r->w_lores, r->hs);
3661  if (++r->ystep >= r->vs) {
3662  r->ystep = 0;
3663  r->line0 = r->line1;
3664  if (++r->ypos < z->img_comp[k].y)
3665  r->line1 += z->img_comp[k].w2;
3666  }
3667  }
3668  if (n >= 3) {
3669  stbi_uc* y = coutput[0];
3670  if (z->s->img_n == 3) {
3671  if (is_rgb) {
3672  for (i = 0; i < z->s->img_x; ++i) {
3673  out[0] = y[i];
3674  out[1] = coutput[1][i];
3675  out[2] = coutput[2][i];
3676  out[3] = 255;
3677  out += n;
3678  }
3679  }
3680  else {
3681  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3682  }
3683  }
3684  else if (z->s->img_n == 4) {
3685  if (z->app14_color_transform == 0) { // CMYK
3686  for (i = 0; i < z->s->img_x; ++i) {
3687  stbi_uc m = coutput[3][i];
3688  out[0] = stbi__blinn_8x8(coutput[0][i], m);
3689  out[1] = stbi__blinn_8x8(coutput[1][i], m);
3690  out[2] = stbi__blinn_8x8(coutput[2][i], m);
3691  out[3] = 255;
3692  out += n;
3693  }
3694  }
3695  else if (z->app14_color_transform == 2) { // YCCK
3696  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3697  for (i = 0; i < z->s->img_x; ++i) {
3698  stbi_uc m = coutput[3][i];
3699  out[0] = stbi__blinn_8x8(255 - out[0], m);
3700  out[1] = stbi__blinn_8x8(255 - out[1], m);
3701  out[2] = stbi__blinn_8x8(255 - out[2], m);
3702  out += n;
3703  }
3704  }
3705  else { // YCbCr + alpha? Ignore the fourth channel for now
3706  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3707  }
3708  }
3709  else
3710  for (i = 0; i < z->s->img_x; ++i) {
3711  out[0] = out[1] = out[2] = y[i];
3712  out[3] = 255; // not used if n==3
3713  out += n;
3714  }
3715  }
3716  else {
3717  if (is_rgb) {
3718  if (n == 1)
3719  for (i = 0; i < z->s->img_x; ++i)
3720  * out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
3721  else {
3722  for (i = 0; i < z->s->img_x; ++i, out += 2) {
3723  out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
3724  out[1] = 255;
3725  }
3726  }
3727  }
3728  else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
3729  for (i = 0; i < z->s->img_x; ++i) {
3730  stbi_uc m = coutput[3][i];
3731  stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
3732  stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
3733  stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
3734  out[0] = stbi__compute_y(r, g, b);
3735  out[1] = 255;
3736  out += n;
3737  }
3738  }
3739  else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
3740  for (i = 0; i < z->s->img_x; ++i) {
3741  out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
3742  out[1] = 255;
3743  out += n;
3744  }
3745  }
3746  else {
3747  stbi_uc* y = coutput[0];
3748  if (n == 1)
3749  for (i = 0; i < z->s->img_x; ++i) out[i] = y[i];
3750  else
3751  for (i = 0; i < z->s->img_x; ++i)* out++ = y[i], * out++ = 255;
3752  }
3753  }
3754  }
3755  stbi__cleanup_jpeg(z);
3756  *out_x = z->s->img_x;
3757  *out_y = z->s->img_y;
3758  if (comp)* comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
3759  return output;
3760  }
3761 }
3762 
3763 static void* stbi__jpeg_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
3764 {
3765  unsigned char* result;
3766  stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
3767  STBI_NOTUSED(ri);
3768  j->s = s;
3769  stbi__setup_jpeg(j);
3770  result = load_jpeg_image(j, x, y, comp, req_comp);
3771  STBI_FREE(j);
3772  return result;
3773 }
3774 
3775 static int stbi__jpeg_test(stbi__context* s)
3776 {
3777  int r;
3778  stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
3779  j->s = s;
3780  stbi__setup_jpeg(j);
3781  r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
3782  stbi__rewind(s);
3783  STBI_FREE(j);
3784  return r;
3785 }
3786 
3787 static int stbi__jpeg_info_raw(stbi__jpeg* j, int* x, int* y, int* comp)
3788 {
3789  if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3790  stbi__rewind(j->s);
3791  return 0;
3792  }
3793  if (x)* x = j->s->img_x;
3794  if (y)* y = j->s->img_y;
3795  if (comp)* comp = j->s->img_n >= 3 ? 3 : 1;
3796  return 1;
3797 }
3798 
3799 static int stbi__jpeg_info(stbi__context* s, int* x, int* y, int* comp)
3800 {
3801  int result;
3802  stbi__jpeg* j = (stbi__jpeg*)(stbi__malloc(sizeof(stbi__jpeg)));
3803  j->s = s;
3804  result = stbi__jpeg_info_raw(j, x, y, comp);
3805  STBI_FREE(j);
3806  return result;
3807 }
3808 #endif
3809 
3810 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3811 // simple implementation
3812 // - all input must be provided in an upfront buffer
3813 // - all output is written to a single output buffer (can malloc/realloc)
3814 // performance
3815 // - fast huffman
3816 
3817 #ifndef STBI_NO_ZLIB
3818 
3819 // fast-way is faster to check than jpeg huffman, but slow way is slower
3820 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3821 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3822 
3823 // zlib-style huffman encoding
3824 // (jpegs packs from left, zlib from right, so can't share code)
3825 typedef struct
3826 {
3827  stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3828  stbi__uint16 firstcode[16];
3829  int maxcode[17];
3830  stbi__uint16 firstsymbol[16];
3831  stbi_uc size[288];
3832  stbi__uint16 value[288];
3833 } stbi__zhuffman;
3834 
3835 stbi_inline static int stbi__bitreverse16(int n)
3836 {
3837  n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3838  n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3839  n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3840  n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3841  return n;
3842 }
3843 
3844 stbi_inline static int stbi__bit_reverse(int v, int bits)
3845 {
3846  STBI_ASSERT(bits <= 16);
3847  // to bit reverse n bits, reverse 16 and shift
3848  // e.g. 11 bits, bit reverse and shift away 5
3849  return stbi__bitreverse16(v) >> (16 - bits);
3850 }
3851 
3852 static int stbi__zbuild_huffman(stbi__zhuffman* z, const stbi_uc* sizelist, int num)
3853 {
3854  int i, k = 0;
3855  int code, next_code[16], sizes[17];
3856 
3857  // DEFLATE spec for generating codes
3858  memset(sizes, 0, sizeof(sizes));
3859  memset(z->fast, 0, sizeof(z->fast));
3860  for (i = 0; i < num; ++i)
3861  ++sizes[sizelist[i]];
3862  sizes[0] = 0;
3863  for (i = 1; i < 16; ++i)
3864  if (sizes[i] > (1 << i))
3865  return stbi__err("bad sizes", "Corrupt PNG");
3866  code = 0;
3867  for (i = 1; i < 16; ++i) {
3868  next_code[i] = code;
3869  z->firstcode[i] = (stbi__uint16)code;
3870  z->firstsymbol[i] = (stbi__uint16)k;
3871  code = (code + sizes[i]);
3872  if (sizes[i])
3873  if (code - 1 >= (1 << i)) return stbi__err("bad codelengths", "Corrupt PNG");
3874  z->maxcode[i] = code << (16 - i); // preshift for inner loop
3875  code <<= 1;
3876  k += sizes[i];
3877  }
3878  z->maxcode[16] = 0x10000; // sentinel
3879  for (i = 0; i < num; ++i) {
3880  int s = sizelist[i];
3881  if (s) {
3882  int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3883  stbi__uint16 fastv = (stbi__uint16)((s << 9) | i);
3884  z->size[c] = (stbi_uc)s;
3885  z->value[c] = (stbi__uint16)i;
3886  if (s <= STBI__ZFAST_BITS) {
3887  int j = stbi__bit_reverse(next_code[s], s);
3888  while (j < (1 << STBI__ZFAST_BITS)) {
3889  z->fast[j] = fastv;
3890  j += (1 << s);
3891  }
3892  }
3893  ++next_code[s];
3894  }
3895  }
3896  return 1;
3897 }
3898 
3899 // zlib-from-memory implementation for PNG reading
3900 // because PNG allows splitting the zlib stream arbitrarily,
3901 // and it's annoying structurally to have PNG call ZLIB call PNG,
3902 // we require PNG read all the IDATs and combine them into a single
3903 // memory buffer
3904 
3905 typedef struct
3906 {
3907  stbi_uc* zbuffer, * zbuffer_end;
3908  int num_bits;
3909  stbi__uint32 code_buffer;
3910 
3911  char* zout;
3912  char* zout_start;
3913  char* zout_end;
3914  int z_expandable;
3915 
3916  stbi__zhuffman z_length, z_distance;
3917 } stbi__zbuf;
3918 
3919 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf* z)
3920 {
3921  if (z->zbuffer >= z->zbuffer_end) return 0;
3922  return *z->zbuffer++;
3923 }
3924 
3925 static void stbi__fill_bits(stbi__zbuf* z)
3926 {
3927  do {
3928  STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3929  z->code_buffer |= (unsigned int)stbi__zget8(z) << z->num_bits;
3930  z->num_bits += 8;
3931  } while (z->num_bits <= 24);
3932 }
3933 
3934 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf* z, int n)
3935 {
3936  unsigned int k;
3937  if (z->num_bits < n) stbi__fill_bits(z);
3938  k = z->code_buffer & ((1 << n) - 1);
3939  z->code_buffer >>= n;
3940  z->num_bits -= n;
3941  return k;
3942 }
3943 
3944 static int stbi__zhuffman_decode_slowpath(stbi__zbuf* a, stbi__zhuffman* z)
3945 {
3946  int b, s, k;
3947  // not resolved by fast table, so compute it the slow way
3948  // use jpeg approach, which requires MSbits at top
3949  k = stbi__bit_reverse(a->code_buffer, 16);
3950  for (s = STBI__ZFAST_BITS + 1; ; ++s)
3951  if (k < z->maxcode[s])
3952  break;
3953  if (s == 16) return -1; // invalid code!
3954  // code size is s, so:
3955  b = (k >> (16 - s)) - z->firstcode[s] + z->firstsymbol[s];
3956  STBI_ASSERT(z->size[b] == s);
3957  a->code_buffer >>= s;
3958  a->num_bits -= s;
3959  return z->value[b];
3960 }
3961 
3962 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf* a, stbi__zhuffman* z)
3963 {
3964  int b, s;
3965  if (a->num_bits < 16) stbi__fill_bits(a);
3966  b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3967  if (b) {
3968  s = b >> 9;
3969  a->code_buffer >>= s;
3970  a->num_bits -= s;
3971  return b & 511;
3972  }
3973  return stbi__zhuffman_decode_slowpath(a, z);
3974 }
3975 
3976 static int stbi__zexpand(stbi__zbuf* z, char* zout, int n) // need to make room for n bytes
3977 {
3978  char* q;
3979  int cur, limit, old_limit;
3980  z->zout = zout;
3981  if (!z->z_expandable) return stbi__err("output buffer limit", "Corrupt PNG");
3982  cur = (int)(z->zout - z->zout_start);
3983  limit = old_limit = (int)(z->zout_end - z->zout_start);
3984  while (cur + n > limit)
3985  limit *= 2;
3986  q = (char*)STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3987  STBI_NOTUSED(old_limit);
3988  if (q == NULL) return stbi__err("outofmem", "Out of memory");
3989  z->zout_start = q;
3990  z->zout = q + cur;
3991  z->zout_end = q + limit;
3992  return 1;
3993 }
3994 
3995 static const int stbi__zlength_base[31] = {
3996  3,4,5,6,7,8,9,10,11,13,
3997  15,17,19,23,27,31,35,43,51,59,
3998  67,83,99,115,131,163,195,227,258,0,0 };
3999 
4000 static const int stbi__zlength_extra[31] =
4001 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
4002 
4003 static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
4004 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0 };
4005 
4006 static const int stbi__zdist_extra[32] =
4007 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13 };
4008 
4009 static int stbi__parse_huffman_block(stbi__zbuf* a)
4010 {
4011  char* zout = a->zout;
4012  for (;;) {
4013  int z = stbi__zhuffman_decode(a, &a->z_length);
4014  if (z < 256) {
4015  if (z < 0) return stbi__err("bad huffman code", "Corrupt PNG"); // error in huffman codes
4016  if (zout >= a->zout_end) {
4017  if (!stbi__zexpand(a, zout, 1)) return 0;
4018  zout = a->zout;
4019  }
4020  *zout++ = (char)z;
4021  }
4022  else {
4023  stbi_uc* p;
4024  int len, dist;
4025  if (z == 256) {
4026  a->zout = zout;
4027  return 1;
4028  }
4029  z -= 257;
4030  len = stbi__zlength_base[z];
4031  if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
4032  z = stbi__zhuffman_decode(a, &a->z_distance);
4033  if (z < 0) return stbi__err("bad huffman code", "Corrupt PNG");
4034  dist = stbi__zdist_base[z];
4035  if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
4036  if (zout - a->zout_start < dist) return stbi__err("bad dist", "Corrupt PNG");
4037  if (zout + len > a->zout_end) {
4038  if (!stbi__zexpand(a, zout, len)) return 0;
4039  zout = a->zout;
4040  }
4041  p = (stbi_uc*)(zout - dist);
4042  if (dist == 1) { // run of one byte; common in images.
4043  stbi_uc v = *p;
4044  if (len) { do *zout++ = v; while (--len); }
4045  }
4046  else {
4047  if (len) { do *zout++ = *p++; while (--len); }
4048  }
4049  }
4050  }
4051 }
4052 
4053 static int stbi__compute_huffman_codes(stbi__zbuf* a)
4054 {
4055  static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
4056  stbi__zhuffman z_codelength;
4057  stbi_uc lencodes[286 + 32 + 137];//padding for maximum single op
4058  stbi_uc codelength_sizes[19];
4059  int i, n;
4060 
4061  int hlit = stbi__zreceive(a, 5) + 257;
4062  int hdist = stbi__zreceive(a, 5) + 1;
4063  int hclen = stbi__zreceive(a, 4) + 4;
4064  int ntot = hlit + hdist;
4065 
4066  memset(codelength_sizes, 0, sizeof(codelength_sizes));
4067  for (i = 0; i < hclen; ++i) {
4068  int s = stbi__zreceive(a, 3);
4069  codelength_sizes[length_dezigzag[i]] = (stbi_uc)s;
4070  }
4071  if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
4072 
4073  n = 0;
4074  while (n < ntot) {
4075  int c = stbi__zhuffman_decode(a, &z_codelength);
4076  if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
4077  if (c < 16)
4078  lencodes[n++] = (stbi_uc)c;
4079  else {
4080  stbi_uc fill = 0;
4081  if (c == 16) {
4082  c = stbi__zreceive(a, 2) + 3;
4083  if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
4084  fill = lencodes[n - 1];
4085  }
4086  else if (c == 17)
4087  c = stbi__zreceive(a, 3) + 3;
4088  else {
4089  STBI_ASSERT(c == 18);
4090  c = stbi__zreceive(a, 7) + 11;
4091  }
4092  if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
4093  memset(lencodes + n, fill, c);
4094  n += c;
4095  }
4096  }
4097  if (n != ntot) return stbi__err("bad codelengths", "Corrupt PNG");
4098  if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
4099  if (!stbi__zbuild_huffman(&a->z_distance, lencodes + hlit, hdist)) return 0;
4100  return 1;
4101 }
4102 
4103 static int stbi__parse_uncompressed_block(stbi__zbuf* a)
4104 {
4105  stbi_uc header[4];
4106  int len, nlen, k;
4107  if (a->num_bits & 7)
4108  stbi__zreceive(a, a->num_bits & 7); // discard
4109  // drain the bit-packed data into header
4110  k = 0;
4111  while (a->num_bits > 0) {
4112  header[k++] = (stbi_uc)(a->code_buffer & 255); // suppress MSVC run-time check
4113  a->code_buffer >>= 8;
4114  a->num_bits -= 8;
4115  }
4116  STBI_ASSERT(a->num_bits == 0);
4117  // now fill header the normal way
4118  while (k < 4)
4119  header[k++] = stbi__zget8(a);
4120  len = header[1] * 256 + header[0];
4121  nlen = header[3] * 256 + header[2];
4122  if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt", "Corrupt PNG");
4123  if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer", "Corrupt PNG");
4124  if (a->zout + len > a->zout_end)
4125  if (!stbi__zexpand(a, a->zout, len)) return 0;
4126  memcpy(a->zout, a->zbuffer, len);
4127  a->zbuffer += len;
4128  a->zout += len;
4129  return 1;
4130 }
4131 
4132 static int stbi__parse_zlib_header(stbi__zbuf* a)
4133 {
4134  int cmf = stbi__zget8(a);
4135  int cm = cmf & 15;
4136  /* int cinfo = cmf >> 4; */
4137  int flg = stbi__zget8(a);
4138  if ((cmf * 256 + flg) % 31 != 0) return stbi__err("bad zlib header", "Corrupt PNG"); // zlib spec
4139  if (flg & 32) return stbi__err("no preset dict", "Corrupt PNG"); // preset dictionary not allowed in png
4140  if (cm != 8) return stbi__err("bad compression", "Corrupt PNG"); // DEFLATE required for png
4141  // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
4142  return 1;
4143 }
4144 
4145 static const stbi_uc stbi__zdefault_length[288] =
4146 {
4147  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4148  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4149  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4150  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4151  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4152  9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4153  9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4154  9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4155  7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
4156 };
4157 static const stbi_uc stbi__zdefault_distance[32] =
4158 {
4159  5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
4160 };
4161 /*
4162 Init algorithm:
4163 {
4164 int i; // use <= to match clearly with spec
4165 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
4166 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
4167 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
4168 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
4169 
4170 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
4171 }
4172 */
4173 
4174 static int stbi__parse_zlib(stbi__zbuf* a, int parse_header)
4175 {
4176  int final, type;
4177  if (parse_header)
4178  if (!stbi__parse_zlib_header(a)) return 0;
4179  a->num_bits = 0;
4180  a->code_buffer = 0;
4181  do {
4182  final = stbi__zreceive(a, 1);
4183  type = stbi__zreceive(a, 2);
4184  if (type == 0) {
4185  if (!stbi__parse_uncompressed_block(a)) return 0;
4186  }
4187  else if (type == 3) {
4188  return 0;
4189  }
4190  else {
4191  if (type == 1) {
4192  // use fixed code lengths
4193  if (!stbi__zbuild_huffman(&a->z_length, stbi__zdefault_length, 288)) return 0;
4194  if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
4195  }
4196  else {
4197  if (!stbi__compute_huffman_codes(a)) return 0;
4198  }
4199  if (!stbi__parse_huffman_block(a)) return 0;
4200  }
4201  } while (!final);
4202  return 1;
4203 }
4204 
4205 static int stbi__do_zlib(stbi__zbuf* a, char* obuf, int olen, int exp, int parse_header)
4206 {
4207  a->zout_start = obuf;
4208  a->zout = obuf;
4209  a->zout_end = obuf + olen;
4210  a->z_expandable = exp;
4211 
4212  return stbi__parse_zlib(a, parse_header);
4213 }
4214 
4215 STBIDEF char* stbi_zlib_decode_malloc_guesssize(const char* buffer, int len, int initial_size, int* outlen)
4216 {
4217  stbi__zbuf a;
4218  char* p = (char*)stbi__malloc(initial_size);
4219  if (p == NULL) return NULL;
4220  a.zbuffer = (stbi_uc*)buffer;
4221  a.zbuffer_end = (stbi_uc*)buffer + len;
4222  if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
4223  if (outlen)* outlen = (int)(a.zout - a.zout_start);
4224  return a.zout_start;
4225  }
4226  else {
4227  STBI_FREE(a.zout_start);
4228  return NULL;
4229  }
4230 }
4231 
4232 STBIDEF char* stbi_zlib_decode_malloc(char const* buffer, int len, int* outlen)
4233 {
4234  return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
4235 }
4236 
4237 STBIDEF char* stbi_zlib_decode_malloc_guesssize_headerflag(const char* buffer, int len, int initial_size, int* outlen, int parse_header)
4238 {
4239  stbi__zbuf a;
4240  char* p = (char*)stbi__malloc(initial_size);
4241  if (p == NULL) return NULL;
4242  a.zbuffer = (stbi_uc*)buffer;
4243  a.zbuffer_end = (stbi_uc*)buffer + len;
4244  if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
4245  if (outlen)* outlen = (int)(a.zout - a.zout_start);
4246  return a.zout_start;
4247  }
4248  else {
4249  STBI_FREE(a.zout_start);
4250  return NULL;
4251  }
4252 }
4253 
4254 STBIDEF int stbi_zlib_decode_buffer(char* obuffer, int olen, char const* ibuffer, int ilen)
4255 {
4256  stbi__zbuf a;
4257  a.zbuffer = (stbi_uc*)ibuffer;
4258  a.zbuffer_end = (stbi_uc*)ibuffer + ilen;
4259  if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
4260  return (int)(a.zout - a.zout_start);
4261  else
4262  return -1;
4263 }
4264 
4265 STBIDEF char* stbi_zlib_decode_noheader_malloc(char const* buffer, int len, int* outlen)
4266 {
4267  stbi__zbuf a;
4268  char* p = (char*)stbi__malloc(16384);
4269  if (p == NULL) return NULL;
4270  a.zbuffer = (stbi_uc*)buffer;
4271  a.zbuffer_end = (stbi_uc*)buffer + len;
4272  if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
4273  if (outlen)* outlen = (int)(a.zout - a.zout_start);
4274  return a.zout_start;
4275  }
4276  else {
4277  STBI_FREE(a.zout_start);
4278  return NULL;
4279  }
4280 }
4281 
4282 STBIDEF int stbi_zlib_decode_noheader_buffer(char* obuffer, int olen, const char* ibuffer, int ilen)
4283 {
4284  stbi__zbuf a;
4285  a.zbuffer = (stbi_uc*)ibuffer;
4286  a.zbuffer_end = (stbi_uc*)ibuffer + ilen;
4287  if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
4288  return (int)(a.zout - a.zout_start);
4289  else
4290  return -1;
4291 }
4292 #endif
4293 
4294 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
4295 // simple implementation
4296 // - only 8-bit samples
4297 // - no CRC checking
4298 // - allocates lots of intermediate memory
4299 // - avoids problem of streaming data between subsystems
4300 // - avoids explicit window management
4301 // performance
4302 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
4303 
4304 #ifndef STBI_NO_PNG
4305 typedef struct
4306 {
4307  stbi__uint32 length;
4308  stbi__uint32 type;
4309 } stbi__pngchunk;
4310 
4311 static stbi__pngchunk stbi__get_chunk_header(stbi__context* s)
4312 {
4313  stbi__pngchunk c;
4314  c.length = stbi__get32be(s);
4315  c.type = stbi__get32be(s);
4316  return c;
4317 }
4318 
4319 static int stbi__check_png_header(stbi__context* s)
4320 {
4321  static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
4322  int i;
4323  for (i = 0; i < 8; ++i)
4324  if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig", "Not a PNG");
4325  return 1;
4326 }
4327 
4328 typedef struct
4329 {
4330  stbi__context* s;
4331  stbi_uc* idata, * expanded, * out;
4332  int depth;
4333 } stbi__png;
4334 
4335 
4336 enum {
4337  STBI__F_none = 0,
4338  STBI__F_sub = 1,
4339  STBI__F_up = 2,
4340  STBI__F_avg = 3,
4341  STBI__F_paeth = 4,
4342  // synthetic filters used for first scanline to avoid needing a dummy row of 0s
4343  STBI__F_avg_first,
4344  STBI__F_paeth_first
4345 };
4346 
4347 static stbi_uc first_row_filter[5] =
4348 {
4349  STBI__F_none,
4350  STBI__F_sub,
4351  STBI__F_none,
4352  STBI__F_avg_first,
4353  STBI__F_paeth_first
4354 };
4355 
4356 static int stbi__paeth(int a, int b, int c)
4357 {
4358  int p = a + b - c;
4359  int pa = abs(p - a);
4360  int pb = abs(p - b);
4361  int pc = abs(p - c);
4362  if (pa <= pb && pa <= pc) return a;
4363  if (pb <= pc) return b;
4364  return c;
4365 }
4366 
4367 static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
4368 
4369 // create the png data from post-deflated data
4370 static int stbi__create_png_image_raw(stbi__png* a, stbi_uc* raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4371 {
4372  int bytes = (depth == 16 ? 2 : 1);
4373  stbi__context* s = a->s;
4374  stbi__uint32 i, j, stride = x * out_n * bytes;
4375  stbi__uint32 img_len, img_width_bytes;
4376  int k;
4377  int img_n = s->img_n; // copy it into a local for later
4378 
4379  int output_bytes = out_n * bytes;
4380  int filter_bytes = img_n * bytes;
4381  int width = x;
4382 
4383  STBI_ASSERT(out_n == s->img_n || out_n == s->img_n + 1);
4384  a->out = (stbi_uc*)stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into
4385  if (!a->out) return stbi__err("outofmem", "Out of memory");
4386 
4387  if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
4388  img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4389  img_len = (img_width_bytes + 1) * y;
4390 
4391  // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
4392  // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
4393  // so just check for raw_len < img_len always.
4394  if (raw_len < img_len) return stbi__err("not enough pixels", "Corrupt PNG");
4395 
4396  for (j = 0; j < y; ++j) {
4397  stbi_uc* cur = a->out + stride * j;
4398  stbi_uc* prior;
4399  int filter = *raw++;
4400 
4401  if (filter > 4)
4402  return stbi__err("invalid filter", "Corrupt PNG");
4403 
4404  if (depth < 8) {
4405  STBI_ASSERT(img_width_bytes <= x);
4406  cur += x * out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4407  filter_bytes = 1;
4408  width = img_width_bytes;
4409  }
4410  prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above
4411 
4412  // if first row, use special filter that doesn't sample previous row
4413  if (j == 0) filter = first_row_filter[filter];
4414 
4415  // handle first byte explicitly
4416  for (k = 0; k < filter_bytes; ++k) {
4417  switch (filter) {
4418  case STBI__F_none: cur[k] = raw[k]; break;
4419  case STBI__F_sub: cur[k] = raw[k]; break;
4420  case STBI__F_up: cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4421  case STBI__F_avg: cur[k] = STBI__BYTECAST(raw[k] + (prior[k] >> 1)); break;
4422  case STBI__F_paeth: cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0, prior[k], 0)); break;
4423  case STBI__F_avg_first: cur[k] = raw[k]; break;
4424  case STBI__F_paeth_first: cur[k] = raw[k]; break;
4425  }
4426  }
4427 
4428  if (depth == 8) {
4429  if (img_n != out_n)
4430  cur[img_n] = 255; // first pixel
4431  raw += img_n;
4432  cur += out_n;
4433  prior += out_n;
4434  }
4435  else if (depth == 16) {
4436  if (img_n != out_n) {
4437  cur[filter_bytes] = 255; // first pixel top byte
4438  cur[filter_bytes + 1] = 255; // first pixel bottom byte
4439  }
4440  raw += filter_bytes;
4441  cur += output_bytes;
4442  prior += output_bytes;
4443  }
4444  else {
4445  raw += 1;
4446  cur += 1;
4447  prior += 1;
4448  }
4449 
4450  // this is a little gross, so that we don't switch per-pixel or per-component
4451  if (depth < 8 || img_n == out_n) {
4452  int nk = (width - 1) * filter_bytes;
4453 #define STBI__CASE(f) \
4454  case f: \
4455  for (k=0; k < nk; ++k)
4456  switch (filter) {
4457  // "none" filter turns into a memcpy here; make that explicit.
4458  case STBI__F_none: memcpy(cur, raw, nk); break;
4459  STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k - filter_bytes]); } break;
4460  STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
4461  STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k - filter_bytes]) >> 1)); } break;
4462  STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k - filter_bytes], prior[k], prior[k - filter_bytes])); } break;
4463  STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k - filter_bytes] >> 1)); } break;
4464  STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k - filter_bytes], 0, 0)); } break;
4465  }
4466 #undef STBI__CASE
4467  raw += nk;
4468  }
4469  else {
4470  STBI_ASSERT(img_n + 1 == out_n);
4471 #define STBI__CASE(f) \
4472  case f: \
4473  for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4474  for (k=0; k < filter_bytes; ++k)
4475  switch (filter) {
4476  STBI__CASE(STBI__F_none) { cur[k] = raw[k]; } break;
4477  STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k - output_bytes]); } break;
4478  STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
4479  STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k - output_bytes]) >> 1)); } break;
4480  STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k - output_bytes], prior[k], prior[k - output_bytes])); } break;
4481  STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k - output_bytes] >> 1)); } break;
4482  STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k - output_bytes], 0, 0)); } break;
4483  }
4484 #undef STBI__CASE
4485 
4486  // the loop above sets the high byte of the pixels' alpha, but for
4487  // 16 bit png files we also need the low byte set. we'll do that here.
4488  if (depth == 16) {
4489  cur = a->out + stride * j; // start at the beginning of the row again
4490  for (i = 0; i < x; ++i, cur += output_bytes) {
4491  cur[filter_bytes + 1] = 255;
4492  }
4493  }
4494  }
4495  }
4496 
4497  // we make a separate pass to expand bits to pixels; for performance,
4498  // this could run two scanlines behind the above code, so it won't
4499  // intefere with filtering but will still be in the cache.
4500  if (depth < 8) {
4501  for (j = 0; j < y; ++j) {
4502  stbi_uc* cur = a->out + stride * j;
4503  stbi_uc* in = a->out + stride * j + x * out_n - img_width_bytes;
4504  // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4505  // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4506  stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4507 
4508  // note that the final byte might overshoot and write more data than desired.
4509  // we can allocate enough data that this never writes out of memory, but it
4510  // could also overwrite the next scanline. can it overwrite non-empty data
4511  // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4512  // so we need to explicitly clamp the final ones
4513 
4514  if (depth == 4) {
4515  for (k = x * img_n; k >= 2; k -= 2, ++in) {
4516  *cur++ = scale * ((*in >> 4));
4517  *cur++ = scale * ((*in) & 0x0f);
4518  }
4519  if (k > 0)* cur++ = scale * ((*in >> 4));
4520  }
4521  else if (depth == 2) {
4522  for (k = x * img_n; k >= 4; k -= 4, ++in) {
4523  *cur++ = scale * ((*in >> 6));
4524  *cur++ = scale * ((*in >> 4) & 0x03);
4525  *cur++ = scale * ((*in >> 2) & 0x03);
4526  *cur++ = scale * ((*in) & 0x03);
4527  }
4528  if (k > 0)* cur++ = scale * ((*in >> 6));
4529  if (k > 1)* cur++ = scale * ((*in >> 4) & 0x03);
4530  if (k > 2)* cur++ = scale * ((*in >> 2) & 0x03);
4531  }
4532  else if (depth == 1) {
4533  for (k = x * img_n; k >= 8; k -= 8, ++in) {
4534  *cur++ = scale * ((*in >> 7));
4535  *cur++ = scale * ((*in >> 6) & 0x01);
4536  *cur++ = scale * ((*in >> 5) & 0x01);
4537  *cur++ = scale * ((*in >> 4) & 0x01);
4538  *cur++ = scale * ((*in >> 3) & 0x01);
4539  *cur++ = scale * ((*in >> 2) & 0x01);
4540  *cur++ = scale * ((*in >> 1) & 0x01);
4541  *cur++ = scale * ((*in) & 0x01);
4542  }
4543  if (k > 0)* cur++ = scale * ((*in >> 7));
4544  if (k > 1)* cur++ = scale * ((*in >> 6) & 0x01);
4545  if (k > 2)* cur++ = scale * ((*in >> 5) & 0x01);
4546  if (k > 3)* cur++ = scale * ((*in >> 4) & 0x01);
4547  if (k > 4)* cur++ = scale * ((*in >> 3) & 0x01);
4548  if (k > 5)* cur++ = scale * ((*in >> 2) & 0x01);
4549  if (k > 6)* cur++ = scale * ((*in >> 1) & 0x01);
4550  }
4551  if (img_n != out_n) {
4552  int q;
4553  // insert alpha = 255
4554  cur = a->out + stride * j;
4555  if (img_n == 1) {
4556  for (q = x - 1; q >= 0; --q) {
4557  cur[q * 2 + 1] = 255;
4558  cur[q * 2 + 0] = cur[q];
4559  }
4560  }
4561  else {
4562  STBI_ASSERT(img_n == 3);
4563  for (q = x - 1; q >= 0; --q) {
4564  cur[q * 4 + 3] = 255;
4565  cur[q * 4 + 2] = cur[q * 3 + 2];
4566  cur[q * 4 + 1] = cur[q * 3 + 1];
4567  cur[q * 4 + 0] = cur[q * 3 + 0];
4568  }
4569  }
4570  }
4571  }
4572  }
4573  else if (depth == 16) {
4574  // force the image data from big-endian to platform-native.
4575  // this is done in a separate pass due to the decoding relying
4576  // on the data being untouched, but could probably be done
4577  // per-line during decode if care is taken.
4578  stbi_uc* cur = a->out;
4579  stbi__uint16* cur16 = (stbi__uint16*)cur;
4580 
4581  for (i = 0; i < x * y * out_n; ++i, cur16++, cur += 2) {
4582  *cur16 = (cur[0] << 8) | cur[1];
4583  }
4584  }
4585 
4586  return 1;
4587 }
4588 
4589 static int stbi__create_png_image(stbi__png* a, stbi_uc* image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4590 {
4591  int bytes = (depth == 16 ? 2 : 1);
4592  int out_bytes = out_n * bytes;
4593  stbi_uc* final;
4594  int p;
4595  if (!interlaced)
4596  return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4597 
4598  // de-interlacing
4599  final = (stbi_uc*)stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
4600  for (p = 0; p < 7; ++p) {
4601  int xorig[] = { 0,4,0,2,0,1,0 };
4602  int yorig[] = { 0,0,4,0,2,0,1 };
4603  int xspc[] = { 8,8,4,4,2,2,1 };
4604  int yspc[] = { 8,8,8,4,4,2,2 };
4605  int i, j, x, y;
4606  // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4607  x = (a->s->img_x - xorig[p] + xspc[p] - 1) / xspc[p];
4608  y = (a->s->img_y - yorig[p] + yspc[p] - 1) / yspc[p];
4609  if (x && y) {
4610  stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4611  if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4612  STBI_FREE(final);
4613  return 0;
4614  }
4615  for (j = 0; j < y; ++j) {
4616  for (i = 0; i < x; ++i) {
4617  int out_y = j * yspc[p] + yorig[p];
4618  int out_x = i * xspc[p] + xorig[p];
4619  memcpy(final + out_y * a->s->img_x * out_bytes + out_x * out_bytes,
4620  a->out + (j * x + i) * out_bytes, out_bytes);
4621  }
4622  }
4623  STBI_FREE(a->out);
4624  image_data += img_len;
4625  image_data_len -= img_len;
4626  }
4627  }
4628  a->out = final;
4629 
4630  return 1;
4631 }
4632 
4633 static int stbi__compute_transparency(stbi__png* z, stbi_uc tc[3], int out_n)
4634 {
4635  stbi__context* s = z->s;
4636  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4637  stbi_uc* p = z->out;
4638 
4639  // compute color-based transparency, assuming we've
4640  // already got 255 as the alpha value in the output
4641  STBI_ASSERT(out_n == 2 || out_n == 4);
4642 
4643  if (out_n == 2) {
4644  for (i = 0; i < pixel_count; ++i) {
4645  p[1] = (p[0] == tc[0] ? 0 : 255);
4646  p += 2;
4647  }
4648  }
4649  else {
4650  for (i = 0; i < pixel_count; ++i) {
4651  if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4652  p[3] = 0;
4653  p += 4;
4654  }
4655  }
4656  return 1;
4657 }
4658 
4659 static int stbi__compute_transparency16(stbi__png* z, stbi__uint16 tc[3], int out_n)
4660 {
4661  stbi__context* s = z->s;
4662  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4663  stbi__uint16* p = (stbi__uint16*)z->out;
4664 
4665  // compute color-based transparency, assuming we've
4666  // already got 65535 as the alpha value in the output
4667  STBI_ASSERT(out_n == 2 || out_n == 4);
4668 
4669  if (out_n == 2) {
4670  for (i = 0; i < pixel_count; ++i) {
4671  p[1] = (p[0] == tc[0] ? 0 : 65535);
4672  p += 2;
4673  }
4674  }
4675  else {
4676  for (i = 0; i < pixel_count; ++i) {
4677  if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4678  p[3] = 0;
4679  p += 4;
4680  }
4681  }
4682  return 1;
4683 }
4684 
4685 static int stbi__expand_png_palette(stbi__png* a, stbi_uc* palette, int len, int pal_img_n)
4686 {
4687  stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4688  stbi_uc* p, * temp_out, * orig = a->out;
4689 
4690  p = (stbi_uc*)stbi__malloc_mad2(pixel_count, pal_img_n, 0);
4691  if (p == NULL) return stbi__err("outofmem", "Out of memory");
4692 
4693  // between here and free(out) below, exitting would leak
4694  temp_out = p;
4695 
4696  if (pal_img_n == 3) {
4697  for (i = 0; i < pixel_count; ++i) {
4698  int n = orig[i] * 4;
4699  p[0] = palette[n];
4700  p[1] = palette[n + 1];
4701  p[2] = palette[n + 2];
4702  p += 3;
4703  }
4704  }
4705  else {
4706  for (i = 0; i < pixel_count; ++i) {
4707  int n = orig[i] * 4;
4708  p[0] = palette[n];
4709  p[1] = palette[n + 1];
4710  p[2] = palette[n + 2];
4711  p[3] = palette[n + 3];
4712  p += 4;
4713  }
4714  }
4715  STBI_FREE(a->out);
4716  a->out = temp_out;
4717 
4718  STBI_NOTUSED(len);
4719 
4720  return 1;
4721 }
4722 
4723 static int stbi__unpremultiply_on_load = 0;
4724 static int stbi__de_iphone_flag = 0;
4725 
4726 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4727 {
4728  stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4729 }
4730 
4731 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4732 {
4733  stbi__de_iphone_flag = flag_true_if_should_convert;
4734 }
4735 
4736 static void stbi__de_iphone(stbi__png* z)
4737 {
4738  stbi__context* s = z->s;
4739  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4740  stbi_uc* p = z->out;
4741 
4742  if (s->img_out_n == 3) { // convert bgr to rgb
4743  for (i = 0; i < pixel_count; ++i) {
4744  stbi_uc t = p[0];
4745  p[0] = p[2];
4746  p[2] = t;
4747  p += 3;
4748  }
4749  }
4750  else {
4751  STBI_ASSERT(s->img_out_n == 4);
4752  if (stbi__unpremultiply_on_load) {
4753  // convert bgr to rgb and unpremultiply
4754  for (i = 0; i < pixel_count; ++i) {
4755  stbi_uc a = p[3];
4756  stbi_uc t = p[0];
4757  if (a) {
4758  stbi_uc half = a / 2;
4759  p[0] = (p[2] * 255 + half) / a;
4760  p[1] = (p[1] * 255 + half) / a;
4761  p[2] = (t * 255 + half) / a;
4762  }
4763  else {
4764  p[0] = p[2];
4765  p[2] = t;
4766  }
4767  p += 4;
4768  }
4769  }
4770  else {
4771  // convert bgr to rgb
4772  for (i = 0; i < pixel_count; ++i) {
4773  stbi_uc t = p[0];
4774  p[0] = p[2];
4775  p[2] = t;
4776  p += 4;
4777  }
4778  }
4779  }
4780 }
4781 
4782 #define STBI__PNG_TYPE(a,b,c,d) (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
4783 
4784 static int stbi__parse_png_file(stbi__png* z, int scan, int req_comp)
4785 {
4786  stbi_uc palette[1024], pal_img_n = 0;
4787  stbi_uc has_trans = 0, tc[3];
4788  stbi__uint16 tc16[3];
4789  stbi__uint32 ioff = 0, idata_limit = 0, i, pal_len = 0;
4790  int first = 1, k, interlace = 0, color = 0, is_iphone = 0;
4791  stbi__context* s = z->s;
4792 
4793  z->expanded = NULL;
4794  z->idata = NULL;
4795  z->out = NULL;
4796 
4797  if (!stbi__check_png_header(s)) return 0;
4798 
4799  if (scan == STBI__SCAN_type) return 1;
4800 
4801  for (;;) {
4802  stbi__pngchunk c = stbi__get_chunk_header(s);
4803  switch (c.type) {
4804  case STBI__PNG_TYPE('C', 'g', 'B', 'I'):
4805  is_iphone = 1;
4806  stbi__skip(s, c.length);
4807  break;
4808  case STBI__PNG_TYPE('I', 'H', 'D', 'R'): {
4809  int comp, filter;
4810  if (!first) return stbi__err("multiple IHDR", "Corrupt PNG");
4811  first = 0;
4812  if (c.length != 13) return stbi__err("bad IHDR len", "Corrupt PNG");
4813  s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large", "Very large image (corrupt?)");
4814  s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large", "Very large image (corrupt?)");
4815  z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only", "PNG not supported: 1/2/4/8/16-bit only");
4816  color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype", "Corrupt PNG");
4817  if (color == 3 && z->depth == 16) return stbi__err("bad ctype", "Corrupt PNG");
4818  if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype", "Corrupt PNG");
4819  comp = stbi__get8(s); if (comp) return stbi__err("bad comp method", "Corrupt PNG");
4820  filter = stbi__get8(s); if (filter) return stbi__err("bad filter method", "Corrupt PNG");
4821  interlace = stbi__get8(s); if (interlace > 1) return stbi__err("bad interlace method", "Corrupt PNG");
4822  if (!s->img_x || !s->img_y) return stbi__err("0-pixel image", "Corrupt PNG");
4823  if (!pal_img_n) {
4824  s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4825  if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4826  if (scan == STBI__SCAN_header) return 1;
4827  }
4828  else {
4829  // if paletted, then pal_n is our final components, and
4830  // img_n is # components to decompress/filter.
4831  s->img_n = 1;
4832  if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large", "Corrupt PNG");
4833  // if SCAN_header, have to scan to see if we have a tRNS
4834  }
4835  break;
4836  }
4837 
4838  case STBI__PNG_TYPE('P', 'L', 'T', 'E'): {
4839  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4840  if (c.length > 256 * 3) return stbi__err("invalid PLTE", "Corrupt PNG");
4841  pal_len = c.length / 3;
4842  if (pal_len * 3 != c.length) return stbi__err("invalid PLTE", "Corrupt PNG");
4843  for (i = 0; i < pal_len; ++i) {
4844  palette[i * 4 + 0] = stbi__get8(s);
4845  palette[i * 4 + 1] = stbi__get8(s);
4846  palette[i * 4 + 2] = stbi__get8(s);
4847  palette[i * 4 + 3] = 255;
4848  }
4849  break;
4850  }
4851 
4852  case STBI__PNG_TYPE('t', 'R', 'N', 'S'): {
4853  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4854  if (z->idata) return stbi__err("tRNS after IDAT", "Corrupt PNG");
4855  if (pal_img_n) {
4856  if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4857  if (pal_len == 0) return stbi__err("tRNS before PLTE", "Corrupt PNG");
4858  if (c.length > pal_len) return stbi__err("bad tRNS len", "Corrupt PNG");
4859  pal_img_n = 4;
4860  for (i = 0; i < c.length; ++i)
4861  palette[i * 4 + 3] = stbi__get8(s);
4862  }
4863  else {
4864  if (!(s->img_n & 1)) return stbi__err("tRNS with alpha", "Corrupt PNG");
4865  if (c.length != (stbi__uint32)s->img_n * 2) return stbi__err("bad tRNS len", "Corrupt PNG");
4866  has_trans = 1;
4867  if (z->depth == 16) {
4868  for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
4869  }
4870  else {
4871  for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
4872  }
4873  }
4874  break;
4875  }
4876 
4877  case STBI__PNG_TYPE('I', 'D', 'A', 'T'): {
4878  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4879  if (pal_img_n && !pal_len) return stbi__err("no PLTE", "Corrupt PNG");
4880  if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4881  if ((int)(ioff + c.length) < (int)ioff) return 0;
4882  if (ioff + c.length > idata_limit) {
4883  stbi__uint32 idata_limit_old = idata_limit;
4884  stbi_uc* p;
4885  if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4886  while (ioff + c.length > idata_limit)
4887  idata_limit *= 2;
4888  STBI_NOTUSED(idata_limit_old);
4889  p = (stbi_uc*)STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4890  z->idata = p;
4891  }
4892  if (!stbi__getn(s, z->idata + ioff, c.length)) return stbi__err("outofdata", "Corrupt PNG");
4893  ioff += c.length;
4894  break;
4895  }
4896 
4897  case STBI__PNG_TYPE('I', 'E', 'N', 'D'): {
4898  stbi__uint32 raw_len, bpl;
4899  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4900  if (scan != STBI__SCAN_load) return 1;
4901  if (z->idata == NULL) return stbi__err("no IDAT", "Corrupt PNG");
4902  // initial guess for decoded data size to avoid unnecessary reallocs
4903  bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
4904  raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4905  z->expanded = (stbi_uc*)stbi_zlib_decode_malloc_guesssize_headerflag((char*)z->idata, ioff, raw_len, (int*)& raw_len, !is_iphone);
4906  if (z->expanded == NULL) return 0; // zlib should set error
4907  STBI_FREE(z->idata); z->idata = NULL;
4908  if ((req_comp == s->img_n + 1 && req_comp != 3 && !pal_img_n) || has_trans)
4909  s->img_out_n = s->img_n + 1;
4910  else
4911  s->img_out_n = s->img_n;
4912  if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
4913  if (has_trans) {
4914  if (z->depth == 16) {
4915  if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
4916  }
4917  else {
4918  if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4919  }
4920  }
4921  if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4922  stbi__de_iphone(z);
4923  if (pal_img_n) {
4924  // pal_img_n == 3 or 4
4925  s->img_n = pal_img_n; // record the actual colors we had
4926  s->img_out_n = pal_img_n;
4927  if (req_comp >= 3) s->img_out_n = req_comp;
4928  if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4929  return 0;
4930  }
4931  else if (has_trans) {
4932  // non-paletted image with tRNS -> source image has (constant) alpha
4933  ++s->img_n;
4934  }
4935  STBI_FREE(z->expanded); z->expanded = NULL;
4936  return 1;
4937  }
4938 
4939  default:
4940  // if critical, fail
4941  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4942  if ((c.type & (1 << 29)) == 0) {
4943 #ifndef STBI_NO_FAILURE_STRINGS
4944  // not threadsafe
4945  static char invalid_chunk[] = "XXXX PNG chunk not known";
4946  invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4947  invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4948  invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4949  invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4950 #endif
4951  return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4952  }
4953  stbi__skip(s, c.length);
4954  break;
4955  }
4956  // end of PNG chunk, read and skip CRC
4957  stbi__get32be(s);
4958  }
4959 }
4960 
4961 static void* stbi__do_png(stbi__png* p, int* x, int* y, int* n, int req_comp, stbi__result_info* ri)
4962 {
4963  void* result = NULL;
4964  if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4965  if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4966  if (p->depth < 8)
4967  ri->bits_per_channel = 8;
4968  else
4969  ri->bits_per_channel = p->depth;
4970  result = p->out;
4971  p->out = NULL;
4972  if (req_comp && req_comp != p->s->img_out_n) {
4973  if (ri->bits_per_channel == 8)
4974  result = stbi__convert_format((unsigned char*)result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4975  else
4976  result = stbi__convert_format16((stbi__uint16*)result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4977  p->s->img_out_n = req_comp;
4978  if (result == NULL) return result;
4979  }
4980  *x = p->s->img_x;
4981  *y = p->s->img_y;
4982  if (n)* n = p->s->img_n;
4983  }
4984  STBI_FREE(p->out); p->out = NULL;
4985  STBI_FREE(p->expanded); p->expanded = NULL;
4986  STBI_FREE(p->idata); p->idata = NULL;
4987 
4988  return result;
4989 }
4990 
4991 static void* stbi__png_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
4992 {
4993  stbi__png p;
4994  p.s = s;
4995  return stbi__do_png(&p, x, y, comp, req_comp, ri);
4996 }
4997 
4998 static int stbi__png_test(stbi__context* s)
4999 {
5000  int r;
5001  r = stbi__check_png_header(s);
5002  stbi__rewind(s);
5003  return r;
5004 }
5005 
5006 static int stbi__png_info_raw(stbi__png* p, int* x, int* y, int* comp)
5007 {
5008  if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
5009  stbi__rewind(p->s);
5010  return 0;
5011  }
5012  if (x)* x = p->s->img_x;
5013  if (y)* y = p->s->img_y;
5014  if (comp)* comp = p->s->img_n;
5015  return 1;
5016 }
5017 
5018 static int stbi__png_info(stbi__context* s, int* x, int* y, int* comp)
5019 {
5020  stbi__png p;
5021  p.s = s;
5022  return stbi__png_info_raw(&p, x, y, comp);
5023 }
5024 
5025 static int stbi__png_is16(stbi__context* s)
5026 {
5027  stbi__png p;
5028  p.s = s;
5029  if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
5030  return 0;
5031  if (p.depth != 16) {
5032  stbi__rewind(p.s);
5033  return 0;
5034  }
5035  return 1;
5036 }
5037 #endif
5038 
5039 // Microsoft/Windows BMP image
5040 
5041 #ifndef STBI_NO_BMP
5042 static int stbi__bmp_test_raw(stbi__context* s)
5043 {
5044  int r;
5045  int sz;
5046  if (stbi__get8(s) != 'B') return 0;
5047  if (stbi__get8(s) != 'M') return 0;
5048  stbi__get32le(s); // discard filesize
5049  stbi__get16le(s); // discard reserved
5050  stbi__get16le(s); // discard reserved
5051  stbi__get32le(s); // discard data offset
5052  sz = stbi__get32le(s);
5053  r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
5054  return r;
5055 }
5056 
5057 static int stbi__bmp_test(stbi__context* s)
5058 {
5059  int r = stbi__bmp_test_raw(s);
5060  stbi__rewind(s);
5061  return r;
5062 }
5063 
5064 
5065 // returns 0..31 for the highest set bit
5066 static int stbi__high_bit(unsigned int z)
5067 {
5068  int n = 0;
5069  if (z == 0) return -1;
5070  if (z >= 0x10000) n += 16, z >>= 16;
5071  if (z >= 0x00100) n += 8, z >>= 8;
5072  if (z >= 0x00010) n += 4, z >>= 4;
5073  if (z >= 0x00004) n += 2, z >>= 2;
5074  if (z >= 0x00002) n += 1, z >>= 1;
5075  return n;
5076 }
5077 
5078 static int stbi__bitcount(unsigned int a)
5079 {
5080  a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
5081  a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
5082  a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
5083  a = (a + (a >> 8)); // max 16 per 8 bits
5084  a = (a + (a >> 16)); // max 32 per 8 bits
5085  return a & 0xff;
5086 }
5087 
5088 // extract an arbitrarily-aligned N-bit value (N=bits)
5089 // from v, and then make it 8-bits long and fractionally
5090 // extend it to full full range.
5091 static int stbi__shiftsigned(int v, int shift, int bits)
5092 {
5093  static unsigned int mul_table[9] = {
5094  0,
5095  0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
5096  0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
5097  };
5098  static unsigned int shift_table[9] = {
5099  0, 0,0,1,0,2,4,6,0,
5100  };
5101  if (shift < 0)
5102  v <<= -shift;
5103  else
5104  v >>= shift;
5105  STBI_ASSERT(v >= 0 && v < 256);
5106  v >>= (8 - bits);
5107  STBI_ASSERT(bits >= 0 && bits <= 8);
5108  return (int)((unsigned)v * mul_table[bits]) >> shift_table[bits];
5109 }
5110 
5111 typedef struct
5112 {
5113  int bpp, offset, hsz;
5114  unsigned int mr, mg, mb, ma, all_a;
5115 } stbi__bmp_data;
5116 
5117 static void* stbi__bmp_parse_header(stbi__context* s, stbi__bmp_data* info)
5118 {
5119  int hsz;
5120  if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
5121  stbi__get32le(s); // discard filesize
5122  stbi__get16le(s); // discard reserved
5123  stbi__get16le(s); // discard reserved
5124  info->offset = stbi__get32le(s);
5125  info->hsz = hsz = stbi__get32le(s);
5126  info->mr = info->mg = info->mb = info->ma = 0;
5127 
5128  if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
5129  if (hsz == 12) {
5130  s->img_x = stbi__get16le(s);
5131  s->img_y = stbi__get16le(s);
5132  }
5133  else {
5134  s->img_x = stbi__get32le(s);
5135  s->img_y = stbi__get32le(s);
5136  }
5137  if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
5138  info->bpp = stbi__get16le(s);
5139  if (hsz != 12) {
5140  int compress = stbi__get32le(s);
5141  if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
5142  stbi__get32le(s); // discard sizeof
5143  stbi__get32le(s); // discard hres
5144  stbi__get32le(s); // discard vres
5145  stbi__get32le(s); // discard colorsused
5146  stbi__get32le(s); // discard max important
5147  if (hsz == 40 || hsz == 56) {
5148  if (hsz == 56) {
5149  stbi__get32le(s);
5150  stbi__get32le(s);
5151  stbi__get32le(s);
5152  stbi__get32le(s);
5153  }
5154  if (info->bpp == 16 || info->bpp == 32) {
5155  if (compress == 0) {
5156  if (info->bpp == 32) {
5157  info->mr = 0xffu << 16;
5158  info->mg = 0xffu << 8;
5159  info->mb = 0xffu << 0;
5160  info->ma = 0xffu << 24;
5161  info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
5162  }
5163  else {
5164  info->mr = 31u << 10;
5165  info->mg = 31u << 5;
5166  info->mb = 31u << 0;
5167  }
5168  }
5169  else if (compress == 3) {
5170  info->mr = stbi__get32le(s);
5171  info->mg = stbi__get32le(s);
5172  info->mb = stbi__get32le(s);
5173  // not documented, but generated by photoshop and handled by mspaint
5174  if (info->mr == info->mg && info->mg == info->mb) {
5175  // ?!?!?
5176  return stbi__errpuc("bad BMP", "bad BMP");
5177  }
5178  }
5179  else
5180  return stbi__errpuc("bad BMP", "bad BMP");
5181  }
5182  }
5183  else {
5184  int i;
5185  if (hsz != 108 && hsz != 124)
5186  return stbi__errpuc("bad BMP", "bad BMP");
5187  info->mr = stbi__get32le(s);
5188  info->mg = stbi__get32le(s);
5189  info->mb = stbi__get32le(s);
5190  info->ma = stbi__get32le(s);
5191  stbi__get32le(s); // discard color space
5192  for (i = 0; i < 12; ++i)
5193  stbi__get32le(s); // discard color space parameters
5194  if (hsz == 124) {
5195  stbi__get32le(s); // discard rendering intent
5196  stbi__get32le(s); // discard offset of profile data
5197  stbi__get32le(s); // discard size of profile data
5198  stbi__get32le(s); // discard reserved
5199  }
5200  }
5201  }
5202  return (void*)1;
5203 }
5204 
5205 
5206 static void* stbi__bmp_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
5207 {
5208  stbi_uc* out;
5209  unsigned int mr = 0, mg = 0, mb = 0, ma = 0, all_a;
5210  stbi_uc pal[256][4];
5211  int psize = 0, i, j, width;
5212  int flip_vertically, pad, target;
5213  stbi__bmp_data info;
5214  STBI_NOTUSED(ri);
5215 
5216  info.all_a = 255;
5217  if (stbi__bmp_parse_header(s, &info) == NULL)
5218  return NULL; // error code already set
5219 
5220  flip_vertically = ((int)s->img_y) > 0;
5221  s->img_y = abs((int)s->img_y);
5222 
5223  mr = info.mr;
5224  mg = info.mg;
5225  mb = info.mb;
5226  ma = info.ma;
5227  all_a = info.all_a;
5228 
5229  if (info.hsz == 12) {
5230  if (info.bpp < 24)
5231  psize = (info.offset - 14 - 24) / 3;
5232  }
5233  else {
5234  if (info.bpp < 16)
5235  psize = (info.offset - 14 - info.hsz) >> 2;
5236  }
5237 
5238  s->img_n = ma ? 4 : 3;
5239  if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
5240  target = req_comp;
5241  else
5242  target = s->img_n; // if they want monochrome, we'll post-convert
5243 
5244  // sanity-check size
5245  if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
5246  return stbi__errpuc("too large", "Corrupt BMP");
5247 
5248  out = (stbi_uc*)stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
5249  if (!out) return stbi__errpuc("outofmem", "Out of memory");
5250  if (info.bpp < 16) {
5251  int z = 0;
5252  if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
5253  for (i = 0; i < psize; ++i) {
5254  pal[i][2] = stbi__get8(s);
5255  pal[i][1] = stbi__get8(s);
5256  pal[i][0] = stbi__get8(s);
5257  if (info.hsz != 12) stbi__get8(s);
5258  pal[i][3] = 255;
5259  }
5260  stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
5261  if (info.bpp == 1) width = (s->img_x + 7) >> 3;
5262  else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
5263  else if (info.bpp == 8) width = s->img_x;
5264  else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
5265  pad = (-width) & 3;
5266  if (info.bpp == 1) {
5267  for (j = 0; j < (int)s->img_y; ++j) {
5268  int bit_offset = 7, v = stbi__get8(s);
5269  for (i = 0; i < (int)s->img_x; ++i) {
5270  int color = (v >> bit_offset) & 0x1;
5271  out[z++] = pal[color][0];
5272  out[z++] = pal[color][1];
5273  out[z++] = pal[color][2];
5274  if ((--bit_offset) < 0) {
5275  bit_offset = 7;
5276  v = stbi__get8(s);
5277  }
5278  }
5279  stbi__skip(s, pad);
5280  }
5281  }
5282  else {
5283  for (j = 0; j < (int)s->img_y; ++j) {
5284  for (i = 0; i < (int)s->img_x; i += 2) {
5285  int v = stbi__get8(s), v2 = 0;
5286  if (info.bpp == 4) {
5287  v2 = v & 15;
5288  v >>= 4;
5289  }
5290  out[z++] = pal[v][0];
5291  out[z++] = pal[v][1];
5292  out[z++] = pal[v][2];
5293  if (target == 4) out[z++] = 255;
5294  if (i + 1 == (int)s->img_x) break;
5295  v = (info.bpp == 8) ? stbi__get8(s) : v2;
5296  out[z++] = pal[v][0];
5297  out[z++] = pal[v][1];
5298  out[z++] = pal[v][2];
5299  if (target == 4) out[z++] = 255;
5300  }
5301  stbi__skip(s, pad);
5302  }
5303  }
5304  }
5305  else {
5306  int rshift = 0, gshift = 0, bshift = 0, ashift = 0, rcount = 0, gcount = 0, bcount = 0, acount = 0;
5307  int z = 0;
5308  int easy = 0;
5309  stbi__skip(s, info.offset - 14 - info.hsz);
5310  if (info.bpp == 24) width = 3 * s->img_x;
5311  else if (info.bpp == 16) width = 2 * s->img_x;
5312  else /* bpp = 32 and pad = 0 */ width = 0;
5313  pad = (-width) & 3;
5314  if (info.bpp == 24) {
5315  easy = 1;
5316  }
5317  else if (info.bpp == 32) {
5318  if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
5319  easy = 2;
5320  }
5321  if (!easy) {
5322  if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
5323  // right shift amt to put high bit in position #7
5324  rshift = stbi__high_bit(mr) - 7; rcount = stbi__bitcount(mr);
5325  gshift = stbi__high_bit(mg) - 7; gcount = stbi__bitcount(mg);
5326  bshift = stbi__high_bit(mb) - 7; bcount = stbi__bitcount(mb);
5327  ashift = stbi__high_bit(ma) - 7; acount = stbi__bitcount(ma);
5328  }
5329  for (j = 0; j < (int)s->img_y; ++j) {
5330  if (easy) {
5331  for (i = 0; i < (int)s->img_x; ++i) {
5332  unsigned char a;
5333  out[z + 2] = stbi__get8(s);
5334  out[z + 1] = stbi__get8(s);
5335  out[z + 0] = stbi__get8(s);
5336  z += 3;
5337  a = (easy == 2 ? stbi__get8(s) : 255);
5338  all_a |= a;
5339  if (target == 4) out[z++] = a;
5340  }
5341  }
5342  else {
5343  int bpp = info.bpp;
5344  for (i = 0; i < (int)s->img_x; ++i) {
5345  stbi__uint32 v = (bpp == 16 ? (stbi__uint32)stbi__get16le(s) : stbi__get32le(s));
5346  unsigned int a;
5347  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
5348  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
5349  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
5350  a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
5351  all_a |= a;
5352  if (target == 4) out[z++] = STBI__BYTECAST(a);
5353  }
5354  }
5355  stbi__skip(s, pad);
5356  }
5357  }
5358 
5359  // if alpha channel is all 0s, replace with all 255s
5360  if (target == 4 && all_a == 0)
5361  for (i = 4 * s->img_x * s->img_y - 1; i >= 0; i -= 4)
5362  out[i] = 255;
5363 
5364  if (flip_vertically) {
5365  stbi_uc t;
5366  for (j = 0; j < (int)s->img_y >> 1; ++j) {
5367  stbi_uc* p1 = out + j * s->img_x * target;
5368  stbi_uc* p2 = out + (s->img_y - 1 - j) * s->img_x * target;
5369  for (i = 0; i < (int)s->img_x * target; ++i) {
5370  t = p1[i], p1[i] = p2[i], p2[i] = t;
5371  }
5372  }
5373  }
5374 
5375  if (req_comp && req_comp != target) {
5376  out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
5377  if (out == NULL) return out; // stbi__convert_format frees input on failure
5378  }
5379 
5380  *x = s->img_x;
5381  *y = s->img_y;
5382  if (comp)* comp = s->img_n;
5383  return out;
5384 }
5385 #endif
5386 
5387 // Targa Truevision - TGA
5388 // by Jonathan Dummer
5389 #ifndef STBI_NO_TGA
5390 // returns STBI_rgb or whatever, 0 on error
5391 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
5392 {
5393  // only RGB or RGBA (incl. 16bit) or grey allowed
5394  if (is_rgb16)* is_rgb16 = 0;
5395  switch (bits_per_pixel) {
5396  case 8: return STBI_grey;
5397  case 16: if (is_grey) return STBI_grey_alpha;
5398  // fallthrough
5399  case 15: if (is_rgb16) * is_rgb16 = 1;
5400  return STBI_rgb;
5401  case 24: // fallthrough
5402  case 32: return bits_per_pixel / 8;
5403  default: return 0;
5404  }
5405 }
5406 
5407 static int stbi__tga_info(stbi__context* s, int* x, int* y, int* comp)
5408 {
5409  int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
5410  int sz, tga_colormap_type;
5411  stbi__get8(s); // discard Offset
5412  tga_colormap_type = stbi__get8(s); // colormap type
5413  if (tga_colormap_type > 1) {
5414  stbi__rewind(s);
5415  return 0; // only RGB or indexed allowed
5416  }
5417  tga_image_type = stbi__get8(s); // image type
5418  if (tga_colormap_type == 1) { // colormapped (paletted) image
5419  if (tga_image_type != 1 && tga_image_type != 9) {
5420  stbi__rewind(s);
5421  return 0;
5422  }
5423  stbi__skip(s, 4); // skip index of first colormap entry and number of entries
5424  sz = stbi__get8(s); // check bits per palette color entry
5425  if ((sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32)) {
5426  stbi__rewind(s);
5427  return 0;
5428  }
5429  stbi__skip(s, 4); // skip image x and y origin
5430  tga_colormap_bpp = sz;
5431  }
5432  else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5433  if ((tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11)) {
5434  stbi__rewind(s);
5435  return 0; // only RGB or grey allowed, +/- RLE
5436  }
5437  stbi__skip(s, 9); // skip colormap specification and image x/y origin
5438  tga_colormap_bpp = 0;
5439  }
5440  tga_w = stbi__get16le(s);
5441  if (tga_w < 1) {
5442  stbi__rewind(s);
5443  return 0; // test width
5444  }
5445  tga_h = stbi__get16le(s);
5446  if (tga_h < 1) {
5447  stbi__rewind(s);
5448  return 0; // test height
5449  }
5450  tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5451  stbi__get8(s); // ignore alpha bits
5452  if (tga_colormap_bpp != 0) {
5453  if ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5454  // when using a colormap, tga_bits_per_pixel is the size of the indexes
5455  // I don't think anything but 8 or 16bit indexes makes sense
5456  stbi__rewind(s);
5457  return 0;
5458  }
5459  tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5460  }
5461  else {
5462  tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5463  }
5464  if (!tga_comp) {
5465  stbi__rewind(s);
5466  return 0;
5467  }
5468  if (x)* x = tga_w;
5469  if (y)* y = tga_h;
5470  if (comp)* comp = tga_comp;
5471  return 1; // seems to have passed everything
5472 }
5473 
5474 static int stbi__tga_test(stbi__context* s)
5475 {
5476  int res = 0;
5477  int sz, tga_color_type;
5478  stbi__get8(s); // discard Offset
5479  tga_color_type = stbi__get8(s); // color type
5480  if (tga_color_type > 1) goto errorEnd; // only RGB or indexed allowed
5481  sz = stbi__get8(s); // image type
5482  if (tga_color_type == 1) { // colormapped (paletted) image
5483  if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5484  stbi__skip(s, 4); // skip index of first colormap entry and number of entries
5485  sz = stbi__get8(s); // check bits per palette color entry
5486  if ((sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32)) goto errorEnd;
5487  stbi__skip(s, 4); // skip image x and y origin
5488  }
5489  else { // "normal" image w/o colormap
5490  if ((sz != 2) && (sz != 3) && (sz != 10) && (sz != 11)) goto errorEnd; // only RGB or grey allowed, +/- RLE
5491  stbi__skip(s, 9); // skip colormap specification and image x/y origin
5492  }
5493  if (stbi__get16le(s) < 1) goto errorEnd; // test width
5494  if (stbi__get16le(s) < 1) goto errorEnd; // test height
5495  sz = stbi__get8(s); // bits per pixel
5496  if ((tga_color_type == 1) && (sz != 8) && (sz != 16)) goto errorEnd; // for colormapped images, bpp is size of an index
5497  if ((sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32)) goto errorEnd;
5498 
5499  res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5500 
5501 errorEnd:
5502  stbi__rewind(s);
5503  return res;
5504 }
5505 
5506 // read 16bit value and convert to 24bit RGB
5507 static void stbi__tga_read_rgb16(stbi__context* s, stbi_uc* out)
5508 {
5509  stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
5510  stbi__uint16 fiveBitMask = 31;
5511  // we have 3 channels with 5bits each
5512  int r = (px >> 10) & fiveBitMask;
5513  int g = (px >> 5) & fiveBitMask;
5514  int b = px & fiveBitMask;
5515  // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
5516  out[0] = (stbi_uc)((r * 255) / 31);
5517  out[1] = (stbi_uc)((g * 255) / 31);
5518  out[2] = (stbi_uc)((b * 255) / 31);
5519 
5520  // some people claim that the most significant bit might be used for alpha
5521  // (possibly if an alpha-bit is set in the "image descriptor byte")
5522  // but that only made 16bit test images completely translucent..
5523  // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
5524 }
5525 
5526 static void* stbi__tga_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
5527 {
5528  // read in the TGA header stuff
5529  int tga_offset = stbi__get8(s);
5530  int tga_indexed = stbi__get8(s);
5531  int tga_image_type = stbi__get8(s);
5532  int tga_is_RLE = 0;
5533  int tga_palette_start = stbi__get16le(s);
5534  int tga_palette_len = stbi__get16le(s);
5535  int tga_palette_bits = stbi__get8(s);
5536  int tga_x_origin = stbi__get16le(s);
5537  int tga_y_origin = stbi__get16le(s);
5538  int tga_width = stbi__get16le(s);
5539  int tga_height = stbi__get16le(s);
5540  int tga_bits_per_pixel = stbi__get8(s);
5541  int tga_comp, tga_rgb16 = 0;
5542  int tga_inverted = stbi__get8(s);
5543  // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5544  // image data
5545  unsigned char* tga_data;
5546  unsigned char* tga_palette = NULL;
5547  int i, j;
5548  unsigned char raw_data[4] = { 0 };
5549  int RLE_count = 0;
5550  int RLE_repeating = 0;
5551  int read_next_pixel = 1;
5552  STBI_NOTUSED(ri);
5553 
5554  // do a tiny bit of precessing
5555  if (tga_image_type >= 8)
5556  {
5557  tga_image_type -= 8;
5558  tga_is_RLE = 1;
5559  }
5560  tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5561 
5562  // If I'm paletted, then I'll use the number of bits from the palette
5563  if (tga_indexed) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5564  else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5565 
5566  if (!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5567  return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5568 
5569  // tga info
5570  *x = tga_width;
5571  *y = tga_height;
5572  if (comp)* comp = tga_comp;
5573 
5574  if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
5575  return stbi__errpuc("too large", "Corrupt TGA");
5576 
5577  tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
5578  if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5579 
5580  // skip to the data's starting position (offset usually = 0)
5581  stbi__skip(s, tga_offset);
5582 
5583  if (!tga_indexed && !tga_is_RLE && !tga_rgb16) {
5584  for (i = 0; i < tga_height; ++i) {
5585  int row = tga_inverted ? tga_height - i - 1 : i;
5586  stbi_uc* tga_row = tga_data + row * tga_width * tga_comp;
5587  stbi__getn(s, tga_row, tga_width * tga_comp);
5588  }
5589  }
5590  else {
5591  // do I need to load a palette?
5592  if (tga_indexed)
5593  {
5594  // any data to skip? (offset usually = 0)
5595  stbi__skip(s, tga_palette_start);
5596  // load the palette
5597  tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
5598  if (!tga_palette) {
5599  STBI_FREE(tga_data);
5600  return stbi__errpuc("outofmem", "Out of memory");
5601  }
5602  if (tga_rgb16) {
5603  stbi_uc* pal_entry = tga_palette;
5604  STBI_ASSERT(tga_comp == STBI_rgb);
5605  for (i = 0; i < tga_palette_len; ++i) {
5606  stbi__tga_read_rgb16(s, pal_entry);
5607  pal_entry += tga_comp;
5608  }
5609  }
5610  else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5611  STBI_FREE(tga_data);
5612  STBI_FREE(tga_palette);
5613  return stbi__errpuc("bad palette", "Corrupt TGA");
5614  }
5615  }
5616  // load the data
5617  for (i = 0; i < tga_width * tga_height; ++i)
5618  {
5619  // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
5620  if (tga_is_RLE)
5621  {
5622  if (RLE_count == 0)
5623  {
5624  // yep, get the next byte as a RLE command
5625  int RLE_cmd = stbi__get8(s);
5626  RLE_count = 1 + (RLE_cmd & 127);
5627  RLE_repeating = RLE_cmd >> 7;
5628  read_next_pixel = 1;
5629  }
5630  else if (!RLE_repeating)
5631  {
5632  read_next_pixel = 1;
5633  }
5634  }
5635  else
5636  {
5637  read_next_pixel = 1;
5638  }
5639  // OK, if I need to read a pixel, do it now
5640  if (read_next_pixel)
5641  {
5642  // load however much data we did have
5643  if (tga_indexed)
5644  {
5645  // read in index, then perform the lookup
5646  int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5647  if (pal_idx >= tga_palette_len) {
5648  // invalid index
5649  pal_idx = 0;
5650  }
5651  pal_idx *= tga_comp;
5652  for (j = 0; j < tga_comp; ++j) {
5653  raw_data[j] = tga_palette[pal_idx + j];
5654  }
5655  }
5656  else if (tga_rgb16) {
5657  STBI_ASSERT(tga_comp == STBI_rgb);
5658  stbi__tga_read_rgb16(s, raw_data);
5659  }
5660  else {
5661  // read in the data raw
5662  for (j = 0; j < tga_comp; ++j) {
5663  raw_data[j] = stbi__get8(s);
5664  }
5665  }
5666  // clear the reading flag for the next pixel
5667  read_next_pixel = 0;
5668  } // end of reading a pixel
5669 
5670  // copy data
5671  for (j = 0; j < tga_comp; ++j)
5672  tga_data[i * tga_comp + j] = raw_data[j];
5673 
5674  // in case we're in RLE mode, keep counting down
5675  --RLE_count;
5676  }
5677  // do I need to invert the image?
5678  if (tga_inverted)
5679  {
5680  for (j = 0; j * 2 < tga_height; ++j)
5681  {
5682  int index1 = j * tga_width * tga_comp;
5683  int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5684  for (i = tga_width * tga_comp; i > 0; --i)
5685  {
5686  unsigned char temp = tga_data[index1];
5687  tga_data[index1] = tga_data[index2];
5688  tga_data[index2] = temp;
5689  ++index1;
5690  ++index2;
5691  }
5692  }
5693  }
5694  // clear my palette, if I had one
5695  if (tga_palette != NULL)
5696  {
5697  STBI_FREE(tga_palette);
5698  }
5699  }
5700 
5701  // swap RGB - if the source data was RGB16, it already is in the right order
5702  if (tga_comp >= 3 && !tga_rgb16)
5703  {
5704  unsigned char* tga_pixel = tga_data;
5705  for (i = 0; i < tga_width * tga_height; ++i)
5706  {
5707  unsigned char temp = tga_pixel[0];
5708  tga_pixel[0] = tga_pixel[2];
5709  tga_pixel[2] = temp;
5710  tga_pixel += tga_comp;
5711  }
5712  }
5713 
5714  // convert to target component count
5715  if (req_comp && req_comp != tga_comp)
5716  tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5717 
5718  // the things I do to get rid of an error message, and yet keep
5719  // Microsoft's C compilers happy... [8^(
5720  tga_palette_start = tga_palette_len = tga_palette_bits =
5721  tga_x_origin = tga_y_origin = 0;
5722  // OK, done
5723  return tga_data;
5724 }
5725 #endif
5726 
5727 // *************************************************************************************************
5728 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5729 
5730 #ifndef STBI_NO_PSD
5731 static int stbi__psd_test(stbi__context* s)
5732 {
5733  int r = (stbi__get32be(s) == 0x38425053);
5734  stbi__rewind(s);
5735  return r;
5736 }
5737 
5738 static int stbi__psd_decode_rle(stbi__context* s, stbi_uc* p, int pixelCount)
5739 {
5740  int count, nleft, len;
5741 
5742  count = 0;
5743  while ((nleft = pixelCount - count) > 0) {
5744  len = stbi__get8(s);
5745  if (len == 128) {
5746  // No-op.
5747  }
5748  else if (len < 128) {
5749  // Copy next len+1 bytes literally.
5750  len++;
5751  if (len > nleft) return 0; // corrupt data
5752  count += len;
5753  while (len) {
5754  *p = stbi__get8(s);
5755  p += 4;
5756  len--;
5757  }
5758  }
5759  else if (len > 128) {
5760  stbi_uc val;
5761  // Next -len+1 bytes in the dest are replicated from next source byte.
5762  // (Interpret len as a negative 8-bit int.)
5763  len = 257 - len;
5764  if (len > nleft) return 0; // corrupt data
5765  val = stbi__get8(s);
5766  count += len;
5767  while (len) {
5768  *p = val;
5769  p += 4;
5770  len--;
5771  }
5772  }
5773  }
5774 
5775  return 1;
5776 }
5777 
5778 static void* stbi__psd_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri, int bpc)
5779 {
5780  int pixelCount;
5781  int channelCount, compression;
5782  int channel, i;
5783  int bitdepth;
5784  int w, h;
5785  stbi_uc* out;
5786  STBI_NOTUSED(ri);
5787 
5788  // Check identifier
5789  if (stbi__get32be(s) != 0x38425053) // "8BPS"
5790  return stbi__errpuc("not PSD", "Corrupt PSD image");
5791 
5792  // Check file type version.
5793  if (stbi__get16be(s) != 1)
5794  return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5795 
5796  // Skip 6 reserved bytes.
5797  stbi__skip(s, 6);
5798 
5799  // Read the number of channels (R, G, B, A, etc).
5800  channelCount = stbi__get16be(s);
5801  if (channelCount < 0 || channelCount > 16)
5802  return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5803 
5804  // Read the rows and columns of the image.
5805  h = stbi__get32be(s);
5806  w = stbi__get32be(s);
5807 
5808  // Make sure the depth is 8 bits.
5809  bitdepth = stbi__get16be(s);
5810  if (bitdepth != 8 && bitdepth != 16)
5811  return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5812 
5813  // Make sure the color mode is RGB.
5814  // Valid options are:
5815  // 0: Bitmap
5816  // 1: Grayscale
5817  // 2: Indexed color
5818  // 3: RGB color
5819  // 4: CMYK color
5820  // 7: Multichannel
5821  // 8: Duotone
5822  // 9: Lab color
5823  if (stbi__get16be(s) != 3)
5824  return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5825 
5826  // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5827  stbi__skip(s, stbi__get32be(s));
5828 
5829  // Skip the image resources. (resolution, pen tool paths, etc)
5830  stbi__skip(s, stbi__get32be(s));
5831 
5832  // Skip the reserved data.
5833  stbi__skip(s, stbi__get32be(s));
5834 
5835  // Find out if the data is compressed.
5836  // Known values:
5837  // 0: no compression
5838  // 1: RLE compressed
5839  compression = stbi__get16be(s);
5840  if (compression > 1)
5841  return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5842 
5843  // Check size
5844  if (!stbi__mad3sizes_valid(4, w, h, 0))
5845  return stbi__errpuc("too large", "Corrupt PSD");
5846 
5847  // Create the destination image.
5848 
5849  if (!compression && bitdepth == 16 && bpc == 16) {
5850  out = (stbi_uc*)stbi__malloc_mad3(8, w, h, 0);
5851  ri->bits_per_channel = 16;
5852  }
5853  else
5854  out = (stbi_uc*)stbi__malloc(4 * w * h);
5855 
5856  if (!out) return stbi__errpuc("outofmem", "Out of memory");
5857  pixelCount = w * h;
5858 
5859  // Initialize the data to zero.
5860  //memset( out, 0, pixelCount * 4 );
5861 
5862  // Finally, the image data.
5863  if (compression) {
5864  // RLE as used by .PSD and .TIFF
5865  // Loop until you get the number of unpacked bytes you are expecting:
5866  // Read the next source byte into n.
5867  // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5868  // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5869  // Else if n is 128, noop.
5870  // Endloop
5871 
5872  // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5873  // which we're going to just skip.
5874  stbi__skip(s, h * channelCount * 2);
5875 
5876  // Read the RLE data by channel.
5877  for (channel = 0; channel < 4; channel++) {
5878  stbi_uc* p;
5879 
5880  p = out + channel;
5881  if (channel >= channelCount) {
5882  // Fill this channel with default data.
5883  for (i = 0; i < pixelCount; i++, p += 4)
5884  * p = (channel == 3 ? 255 : 0);
5885  }
5886  else {
5887  // Read the RLE data.
5888  if (!stbi__psd_decode_rle(s, p, pixelCount)) {
5889  STBI_FREE(out);
5890  return stbi__errpuc("corrupt", "bad RLE data");
5891  }
5892  }
5893  }
5894 
5895  }
5896  else {
5897  // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5898  // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
5899 
5900  // Read the data by channel.
5901  for (channel = 0; channel < 4; channel++) {
5902  if (channel >= channelCount) {
5903  // Fill this channel with default data.
5904  if (bitdepth == 16 && bpc == 16) {
5905  stbi__uint16* q = ((stbi__uint16*)out) + channel;
5906  stbi__uint16 val = channel == 3 ? 65535 : 0;
5907  for (i = 0; i < pixelCount; i++, q += 4)
5908  * q = val;
5909  }
5910  else {
5911  stbi_uc* p = out + channel;
5912  stbi_uc val = channel == 3 ? 255 : 0;
5913  for (i = 0; i < pixelCount; i++, p += 4)
5914  * p = val;
5915  }
5916  }
5917  else {
5918  if (ri->bits_per_channel == 16) { // output bpc
5919  stbi__uint16* q = ((stbi__uint16*)out) + channel;
5920  for (i = 0; i < pixelCount; i++, q += 4)
5921  * q = (stbi__uint16)stbi__get16be(s);
5922  }
5923  else {
5924  stbi_uc* p = out + channel;
5925  if (bitdepth == 16) { // input bpc
5926  for (i = 0; i < pixelCount; i++, p += 4)
5927  * p = (stbi_uc)(stbi__get16be(s) >> 8);
5928  }
5929  else {
5930  for (i = 0; i < pixelCount; i++, p += 4)
5931  * p = stbi__get8(s);
5932  }
5933  }
5934  }
5935  }
5936  }
5937 
5938  // remove weird white matte from PSD
5939  if (channelCount >= 4) {
5940  if (ri->bits_per_channel == 16) {
5941  for (i = 0; i < w * h; ++i) {
5942  stbi__uint16* pixel = (stbi__uint16*)out + 4 * i;
5943  if (pixel[3] != 0 && pixel[3] != 65535) {
5944  float a = pixel[3] / 65535.0f;
5945  float ra = 1.0f / a;
5946  float inv_a = 65535.0f * (1 - ra);
5947  pixel[0] = (stbi__uint16)(pixel[0] * ra + inv_a);
5948  pixel[1] = (stbi__uint16)(pixel[1] * ra + inv_a);
5949  pixel[2] = (stbi__uint16)(pixel[2] * ra + inv_a);
5950  }
5951  }
5952  }
5953  else {
5954  for (i = 0; i < w * h; ++i) {
5955  unsigned char* pixel = out + 4 * i;
5956  if (pixel[3] != 0 && pixel[3] != 255) {
5957  float a = pixel[3] / 255.0f;
5958  float ra = 1.0f / a;
5959  float inv_a = 255.0f * (1 - ra);
5960  pixel[0] = (unsigned char)(pixel[0] * ra + inv_a);
5961  pixel[1] = (unsigned char)(pixel[1] * ra + inv_a);
5962  pixel[2] = (unsigned char)(pixel[2] * ra + inv_a);
5963  }
5964  }
5965  }
5966  }
5967 
5968  // convert to desired output format
5969  if (req_comp && req_comp != 4) {
5970  if (ri->bits_per_channel == 16)
5971  out = (stbi_uc*)stbi__convert_format16((stbi__uint16*)out, 4, req_comp, w, h);
5972  else
5973  out = stbi__convert_format(out, 4, req_comp, w, h);
5974  if (out == NULL) return out; // stbi__convert_format frees input on failure
5975  }
5976 
5977  if (comp)* comp = 4;
5978  *y = h;
5979  *x = w;
5980 
5981  return out;
5982 }
5983 #endif
5984 
5985 // *************************************************************************************************
5986 // Softimage PIC loader
5987 // by Tom Seddon
5988 //
5989 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5990 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5991 
5992 #ifndef STBI_NO_PIC
5993 static int stbi__pic_is4(stbi__context* s, const char* str)
5994 {
5995  int i;
5996  for (i = 0; i < 4; ++i)
5997  if (stbi__get8(s) != (stbi_uc)str[i])
5998  return 0;
5999 
6000  return 1;
6001 }
6002 
6003 static int stbi__pic_test_core(stbi__context* s)
6004 {
6005  int i;
6006 
6007  if (!stbi__pic_is4(s, "\x53\x80\xF6\x34"))
6008  return 0;
6009 
6010  for (i = 0; i < 84; ++i)
6011  stbi__get8(s);
6012 
6013  if (!stbi__pic_is4(s, "PICT"))
6014  return 0;
6015 
6016  return 1;
6017 }
6018 
6019 typedef struct
6020 {
6021  stbi_uc size, type, channel;
6022 } stbi__pic_packet;
6023 
6024 static stbi_uc* stbi__readval(stbi__context* s, int channel, stbi_uc* dest)
6025 {
6026  int mask = 0x80, i;
6027 
6028  for (i = 0; i < 4; ++i, mask >>= 1) {
6029  if (channel & mask) {
6030  if (stbi__at_eof(s)) return stbi__errpuc("bad file", "PIC file too short");
6031  dest[i] = stbi__get8(s);
6032  }
6033  }
6034 
6035  return dest;
6036 }
6037 
6038 static void stbi__copyval(int channel, stbi_uc* dest, const stbi_uc* src)
6039 {
6040  int mask = 0x80, i;
6041 
6042  for (i = 0; i < 4; ++i, mask >>= 1)
6043  if (channel & mask)
6044  dest[i] = src[i];
6045 }
6046 
6047 static stbi_uc* stbi__pic_load_core(stbi__context* s, int width, int height, int* comp, stbi_uc* result)
6048 {
6049  int act_comp = 0, num_packets = 0, y, chained;
6050  stbi__pic_packet packets[10];
6051 
6052  // this will (should...) cater for even some bizarre stuff like having data
6053  // for the same channel in multiple packets.
6054  do {
6055  stbi__pic_packet* packet;
6056 
6057  if (num_packets == sizeof(packets) / sizeof(packets[0]))
6058  return stbi__errpuc("bad format", "too many packets");
6059 
6060  packet = &packets[num_packets++];
6061 
6062  chained = stbi__get8(s);
6063  packet->size = stbi__get8(s);
6064  packet->type = stbi__get8(s);
6065  packet->channel = stbi__get8(s);
6066 
6067  act_comp |= packet->channel;
6068 
6069  if (stbi__at_eof(s)) return stbi__errpuc("bad file", "file too short (reading packets)");
6070  if (packet->size != 8) return stbi__errpuc("bad format", "packet isn't 8bpp");
6071  } while (chained);
6072 
6073  *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
6074 
6075  for (y = 0; y < height; ++y) {
6076  int packet_idx;
6077 
6078  for (packet_idx = 0; packet_idx < num_packets; ++packet_idx) {
6079  stbi__pic_packet* packet = &packets[packet_idx];
6080  stbi_uc* dest = result + y * width * 4;
6081 
6082  switch (packet->type) {
6083  default:
6084  return stbi__errpuc("bad format", "packet has bad compression type");
6085 
6086  case 0: {//uncompressed
6087  int x;
6088 
6089  for (x = 0; x < width; ++x, dest += 4)
6090  if (!stbi__readval(s, packet->channel, dest))
6091  return 0;
6092  break;
6093  }
6094 
6095  case 1://Pure RLE
6096  {
6097  int left = width, i;
6098 
6099  while (left > 0) {
6100  stbi_uc count, value[4];
6101 
6102  count = stbi__get8(s);
6103  if (stbi__at_eof(s)) return stbi__errpuc("bad file", "file too short (pure read count)");
6104 
6105  if (count > left)
6106  count = (stbi_uc)left;
6107 
6108  if (!stbi__readval(s, packet->channel, value)) return 0;
6109 
6110  for (i = 0; i < count; ++i, dest += 4)
6111  stbi__copyval(packet->channel, dest, value);
6112  left -= count;
6113  }
6114  }
6115  break;
6116 
6117  case 2: {//Mixed RLE
6118  int left = width;
6119  while (left > 0) {
6120  int count = stbi__get8(s), i;
6121  if (stbi__at_eof(s)) return stbi__errpuc("bad file", "file too short (mixed read count)");
6122 
6123  if (count >= 128) { // Repeated
6124  stbi_uc value[4];
6125 
6126  if (count == 128)
6127  count = stbi__get16be(s);
6128  else
6129  count -= 127;
6130  if (count > left)
6131  return stbi__errpuc("bad file", "scanline overrun");
6132 
6133  if (!stbi__readval(s, packet->channel, value))
6134  return 0;
6135 
6136  for (i = 0; i < count; ++i, dest += 4)
6137  stbi__copyval(packet->channel, dest, value);
6138  }
6139  else { // Raw
6140  ++count;
6141  if (count > left) return stbi__errpuc("bad file", "scanline overrun");
6142 
6143  for (i = 0; i < count; ++i, dest += 4)
6144  if (!stbi__readval(s, packet->channel, dest))
6145  return 0;
6146  }
6147  left -= count;
6148  }
6149  break;
6150  }
6151  }
6152  }
6153  }
6154 
6155  return result;
6156 }
6157 
6158 static void* stbi__pic_load(stbi__context* s, int* px, int* py, int* comp, int req_comp, stbi__result_info* ri)
6159 {
6160  stbi_uc* result;
6161  int i, x, y, internal_comp;
6162  STBI_NOTUSED(ri);
6163 
6164  if (!comp) comp = &internal_comp;
6165 
6166  for (i = 0; i < 92; ++i)
6167  stbi__get8(s);
6168 
6169  x = stbi__get16be(s);
6170  y = stbi__get16be(s);
6171  if (stbi__at_eof(s)) return stbi__errpuc("bad file", "file too short (pic header)");
6172  if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
6173 
6174  stbi__get32be(s); //skip `ratio'
6175  stbi__get16be(s); //skip `fields'
6176  stbi__get16be(s); //skip `pad'
6177 
6178  // intermediate buffer is RGBA
6179  result = (stbi_uc*)stbi__malloc_mad3(x, y, 4, 0);
6180  memset(result, 0xff, x * y * 4);
6181 
6182  if (!stbi__pic_load_core(s, x, y, comp, result)) {
6183  STBI_FREE(result);
6184  result = 0;
6185  }
6186  *px = x;
6187  *py = y;
6188  if (req_comp == 0) req_comp = *comp;
6189  result = stbi__convert_format(result, 4, req_comp, x, y);
6190 
6191  return result;
6192 }
6193 
6194 static int stbi__pic_test(stbi__context* s)
6195 {
6196  int r = stbi__pic_test_core(s);
6197  stbi__rewind(s);
6198  return r;
6199 }
6200 #endif
6201 
6202 // *************************************************************************************************
6203 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
6204 
6205 #ifndef STBI_NO_GIF
6206 typedef struct
6207 {
6208  stbi__int16 prefix;
6209  stbi_uc first;
6210  stbi_uc suffix;
6211 } stbi__gif_lzw;
6212 
6213 typedef struct
6214 {
6215  int w, h;
6216  stbi_uc* out; // output buffer (always 4 components)
6217  stbi_uc* background; // The current "background" as far as a gif is concerned
6218  stbi_uc* history;
6219  int flags, bgindex, ratio, transparent, eflags;
6220  stbi_uc pal[256][4];
6221  stbi_uc lpal[256][4];
6222  stbi__gif_lzw codes[8192];
6223  stbi_uc* color_table;
6224  int parse, step;
6225  int lflags;
6226  int start_x, start_y;
6227  int max_x, max_y;
6228  int cur_x, cur_y;
6229  int line_size;
6230  int delay;
6231 } stbi__gif;
6232 
6233 static int stbi__gif_test_raw(stbi__context* s)
6234 {
6235  int sz;
6236  if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
6237  sz = stbi__get8(s);
6238  if (sz != '9' && sz != '7') return 0;
6239  if (stbi__get8(s) != 'a') return 0;
6240  return 1;
6241 }
6242 
6243 static int stbi__gif_test(stbi__context* s)
6244 {
6245  int r = stbi__gif_test_raw(s);
6246  stbi__rewind(s);
6247  return r;
6248 }
6249 
6250 static void stbi__gif_parse_colortable(stbi__context* s, stbi_uc pal[256][4], int num_entries, int transp)
6251 {
6252  int i;
6253  for (i = 0; i < num_entries; ++i) {
6254  pal[i][2] = stbi__get8(s);
6255  pal[i][1] = stbi__get8(s);
6256  pal[i][0] = stbi__get8(s);
6257  pal[i][3] = transp == i ? 0 : 255;
6258  }
6259 }
6260 
6261 static int stbi__gif_header(stbi__context* s, stbi__gif* g, int* comp, int is_info)
6262 {
6263  stbi_uc version;
6264  if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
6265  return stbi__err("not GIF", "Corrupt GIF");
6266 
6267  version = stbi__get8(s);
6268  if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
6269  if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
6270 
6271  stbi__g_failure_reason = "";
6272  g->w = stbi__get16le(s);
6273  g->h = stbi__get16le(s);
6274  g->flags = stbi__get8(s);
6275  g->bgindex = stbi__get8(s);
6276  g->ratio = stbi__get8(s);
6277  g->transparent = -1;
6278 
6279  if (comp != 0)* comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
6280 
6281  if (is_info) return 1;
6282 
6283  if (g->flags & 0x80)
6284  stbi__gif_parse_colortable(s, g->pal, 2 << (g->flags & 7), -1);
6285 
6286  return 1;
6287 }
6288 
6289 static int stbi__gif_info_raw(stbi__context* s, int* x, int* y, int* comp)
6290 {
6291  stbi__gif* g = (stbi__gif*)stbi__malloc(sizeof(stbi__gif));
6292  if (!stbi__gif_header(s, g, comp, 1)) {
6293  STBI_FREE(g);
6294  stbi__rewind(s);
6295  return 0;
6296  }
6297  if (x)* x = g->w;
6298  if (y)* y = g->h;
6299  STBI_FREE(g);
6300  return 1;
6301 }
6302 
6303 static void stbi__out_gif_code(stbi__gif* g, stbi__uint16 code)
6304 {
6305  stbi_uc* p, * c;
6306  int idx;
6307 
6308  // recurse to decode the prefixes, since the linked-list is backwards,
6309  // and working backwards through an interleaved image would be nasty
6310  if (g->codes[code].prefix >= 0)
6311  stbi__out_gif_code(g, g->codes[code].prefix);
6312 
6313  if (g->cur_y >= g->max_y) return;
6314 
6315  idx = g->cur_x + g->cur_y;
6316  p = &g->out[idx];
6317  g->history[idx / 4] = 1;
6318 
6319  c = &g->color_table[g->codes[code].suffix * 4];
6320  if (c[3] > 128) { // don't render transparent pixels;
6321  p[0] = c[2];
6322  p[1] = c[1];
6323  p[2] = c[0];
6324  p[3] = c[3];
6325  }
6326  g->cur_x += 4;
6327 
6328  if (g->cur_x >= g->max_x) {
6329  g->cur_x = g->start_x;
6330  g->cur_y += g->step;
6331 
6332  while (g->cur_y >= g->max_y && g->parse > 0) {
6333  g->step = (1 << g->parse) * g->line_size;
6334  g->cur_y = g->start_y + (g->step >> 1);
6335  --g->parse;
6336  }
6337  }
6338 }
6339 
6340 static stbi_uc* stbi__process_gif_raster(stbi__context* s, stbi__gif* g)
6341 {
6342  stbi_uc lzw_cs;
6343  stbi__int32 len, init_code;
6344  stbi__uint32 first;
6345  stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
6346  stbi__gif_lzw* p;
6347 
6348  lzw_cs = stbi__get8(s);
6349  if (lzw_cs > 12) return NULL;
6350  clear = 1 << lzw_cs;
6351  first = 1;
6352  codesize = lzw_cs + 1;
6353  codemask = (1 << codesize) - 1;
6354  bits = 0;
6355  valid_bits = 0;
6356  for (init_code = 0; init_code < clear; init_code++) {
6357  g->codes[init_code].prefix = -1;
6358  g->codes[init_code].first = (stbi_uc)init_code;
6359  g->codes[init_code].suffix = (stbi_uc)init_code;
6360  }
6361 
6362  // support no starting clear code
6363  avail = clear + 2;
6364  oldcode = -1;
6365 
6366  len = 0;
6367  for (;;) {
6368  if (valid_bits < codesize) {
6369  if (len == 0) {
6370  len = stbi__get8(s); // start new block
6371  if (len == 0)
6372  return g->out;
6373  }
6374  --len;
6375  bits |= (stbi__int32)stbi__get8(s) << valid_bits;
6376  valid_bits += 8;
6377  }
6378  else {
6379  stbi__int32 code = bits & codemask;
6380  bits >>= codesize;
6381  valid_bits -= codesize;
6382  // @OPTIMIZE: is there some way we can accelerate the non-clear path?
6383  if (code == clear) { // clear code
6384  codesize = lzw_cs + 1;
6385  codemask = (1 << codesize) - 1;
6386  avail = clear + 2;
6387  oldcode = -1;
6388  first = 0;
6389  }
6390  else if (code == clear + 1) { // end of stream code
6391  stbi__skip(s, len);
6392  while ((len = stbi__get8(s)) > 0)
6393  stbi__skip(s, len);
6394  return g->out;
6395  }
6396  else if (code <= avail) {
6397  if (first) {
6398  return stbi__errpuc("no clear code", "Corrupt GIF");
6399  }
6400 
6401  if (oldcode >= 0) {
6402  p = &g->codes[avail++];
6403  if (avail > 8192) {
6404  return stbi__errpuc("too many codes", "Corrupt GIF");
6405  }
6406 
6407  p->prefix = (stbi__int16)oldcode;
6408  p->first = g->codes[oldcode].first;
6409  p->suffix = (code == avail) ? p->first : g->codes[code].first;
6410  }
6411  else if (code == avail)
6412  return stbi__errpuc("illegal code in raster", "Corrupt GIF");
6413 
6414  stbi__out_gif_code(g, (stbi__uint16)code);
6415 
6416  if ((avail & codemask) == 0 && avail <= 0x0FFF) {
6417  codesize++;
6418  codemask = (1 << codesize) - 1;
6419  }
6420 
6421  oldcode = code;
6422  }
6423  else {
6424  return stbi__errpuc("illegal code in raster", "Corrupt GIF");
6425  }
6426  }
6427  }
6428 }
6429 
6430 // this function is designed to support animated gifs, although stb_image doesn't support it
6431 // two back is the image from two frames ago, used for a very specific disposal format
6432 static stbi_uc* stbi__gif_load_next(stbi__context* s, stbi__gif* g, int* comp, int req_comp, stbi_uc* two_back)
6433 {
6434  int dispose;
6435  int first_frame;
6436  int piv;
6437  int pcount;
6438 
6439  // on first frame, any non-written pixels get the background colour (non-transparent)
6440  first_frame = 0;
6441  if (g->out == 0) {
6442  if (!stbi__gif_header(s, g, comp, 0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
6443  g->out = (stbi_uc*)stbi__malloc(4 * g->w * g->h);
6444  g->background = (stbi_uc*)stbi__malloc(4 * g->w * g->h);
6445  g->history = (stbi_uc*)stbi__malloc(g->w * g->h);
6446  if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
6447 
6448  // image is treated as "tranparent" at the start - ie, nothing overwrites the current background;
6449  // background colour is only used for pixels that are not rendered first frame, after that "background"
6450  // color refers to teh color that was there the previous frame.
6451  memset(g->out, 0x00, 4 * g->w * g->h);
6452  memset(g->background, 0x00, 4 * g->w * g->h); // state of the background (starts transparent)
6453  memset(g->history, 0x00, g->w * g->h); // pixels that were affected previous frame
6454  first_frame = 1;
6455  }
6456  else {
6457  // second frame - how do we dispoase of the previous one?
6458  dispose = (g->eflags & 0x1C) >> 2;
6459  pcount = g->w * g->h;
6460 
6461  if ((dispose == 3) && (two_back == 0)) {
6462  dispose = 2; // if I don't have an image to revert back to, default to the old background
6463  }
6464 
6465  if (dispose == 3) { // use previous graphic
6466  for (piv = 0; piv < pcount; ++piv) {
6467  if (g->history[piv]) {
6468  memcpy(&g->out[piv * 4], &two_back[piv * 4], 4);
6469  }
6470  }
6471  }
6472  else if (dispose == 2) {
6473  // restore what was changed last frame to background before that frame;
6474  for (piv = 0; piv < pcount; ++piv) {
6475  if (g->history[piv]) {
6476  memcpy(&g->out[piv * 4], &g->background[piv * 4], 4);
6477  }
6478  }
6479  }
6480  else {
6481  // This is a non-disposal case eithe way, so just
6482  // leave the pixels as is, and they will become the new background
6483  // 1: do not dispose
6484  // 0: not specified.
6485  }
6486 
6487  // background is what out is after the undoing of the previou frame;
6488  memcpy(g->background, g->out, 4 * g->w * g->h);
6489  }
6490 
6491  // clear my history;
6492  memset(g->history, 0x00, g->w * g->h); // pixels that were affected previous frame
6493 
6494  for (;;) {
6495  int tag = stbi__get8(s);
6496  switch (tag) {
6497  case 0x2C: /* Image Descriptor */
6498  {
6499  stbi__int32 x, y, w, h;
6500  stbi_uc* o;
6501 
6502  x = stbi__get16le(s);
6503  y = stbi__get16le(s);
6504  w = stbi__get16le(s);
6505  h = stbi__get16le(s);
6506  if (((x + w) > (g->w)) || ((y + h) > (g->h)))
6507  return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
6508 
6509  g->line_size = g->w * 4;
6510  g->start_x = x * 4;
6511  g->start_y = y * g->line_size;
6512  g->max_x = g->start_x + w * 4;
6513  g->max_y = g->start_y + h * g->line_size;
6514  g->cur_x = g->start_x;
6515  g->cur_y = g->start_y;
6516 
6517  g->lflags = stbi__get8(s);
6518 
6519  if (g->lflags & 0x40) {
6520  g->step = 8 * g->line_size; // first interlaced spacing
6521  g->parse = 3;
6522  }
6523  else {
6524  g->step = g->line_size;
6525  g->parse = 0;
6526  }
6527 
6528  if (g->lflags & 0x80) {
6529  stbi__gif_parse_colortable(s, g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
6530  g->color_table = (stbi_uc*)g->lpal;
6531  }
6532  else if (g->flags & 0x80) {
6533  g->color_table = (stbi_uc*)g->pal;
6534  }
6535  else
6536  return stbi__errpuc("missing color table", "Corrupt GIF");
6537 
6538  o = stbi__process_gif_raster(s, g);
6539  if (o == NULL) return NULL;
6540 
6541  // if this was the first frame,
6542  pcount = g->w * g->h;
6543  if (first_frame && (g->bgindex > 0)) {
6544  // if first frame, any pixel not drawn to gets the background color
6545  for (piv = 0; piv < pcount; ++piv) {
6546  if (g->history[piv] == 0) {
6547  g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
6548  memcpy(&g->out[piv * 4], &g->pal[g->bgindex], 4);
6549  }
6550  }
6551  }
6552 
6553  return o;
6554  }
6555 
6556  case 0x21: // Comment Extension.
6557  {
6558  int len;
6559  int ext = stbi__get8(s);
6560  if (ext == 0xF9) { // Graphic Control Extension.
6561  len = stbi__get8(s);
6562  if (len == 4) {
6563  g->eflags = stbi__get8(s);
6564  g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
6565 
6566  // unset old transparent
6567  if (g->transparent >= 0) {
6568  g->pal[g->transparent][3] = 255;
6569  }
6570  if (g->eflags & 0x01) {
6571  g->transparent = stbi__get8(s);
6572  if (g->transparent >= 0) {
6573  g->pal[g->transparent][3] = 0;
6574  }
6575  }
6576  else {
6577  // don't need transparent
6578  stbi__skip(s, 1);
6579  g->transparent = -1;
6580  }
6581  }
6582  else {
6583  stbi__skip(s, len);
6584  break;
6585  }
6586  }
6587  while ((len = stbi__get8(s)) != 0) {
6588  stbi__skip(s, len);
6589  }
6590  break;
6591  }
6592 
6593  case 0x3B: // gif stream termination code
6594  return (stbi_uc*)s; // using '1' causes warning on some compilers
6595 
6596  default:
6597  return stbi__errpuc("unknown code", "Corrupt GIF");
6598  }
6599  }
6600 }
6601 
6602 static void* stbi__load_gif_main(stbi__context* s, int** delays, int* x, int* y, int* z, int* comp, int req_comp)
6603 {
6604  if (stbi__gif_test(s)) {
6605  int layers = 0;
6606  stbi_uc* u = 0;
6607  stbi_uc* out = 0;
6608  stbi_uc* two_back = 0;
6609  stbi__gif g;
6610  int stride;
6611  memset(&g, 0, sizeof(g));
6612  if (delays) {
6613  *delays = 0;
6614  }
6615 
6616  do {
6617  u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
6618  if (u == (stbi_uc*)s) u = 0; // end of animated gif marker
6619 
6620  if (u) {
6621  *x = g.w;
6622  *y = g.h;
6623  ++layers;
6624  stride = g.w * g.h * 4;
6625 
6626  if (out) {
6627  out = (stbi_uc*)STBI_REALLOC(out, layers * stride);
6628  if (delays) {
6629  *delays = (int*)STBI_REALLOC(*delays, sizeof(int) * layers);
6630  }
6631  }
6632  else {
6633  out = (stbi_uc*)stbi__malloc(layers * stride);
6634  if (delays) {
6635  *delays = (int*)stbi__malloc(layers * sizeof(int));
6636  }
6637  }
6638  memcpy(out + ((layers - 1) * stride), u, stride);
6639  if (layers >= 2) {
6640  two_back = out - 2 * stride;
6641  }
6642 
6643  if (delays) {
6644  (*delays)[layers - 1U] = g.delay;
6645  }
6646  }
6647  } while (u != 0);
6648 
6649  // free temp buffer;
6650  STBI_FREE(g.out);
6651  STBI_FREE(g.history);
6652  STBI_FREE(g.background);
6653 
6654  // do the final conversion after loading everything;
6655  if (req_comp && req_comp != 4)
6656  out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
6657 
6658  *z = layers;
6659  return out;
6660  }
6661  else {
6662  return stbi__errpuc("not GIF", "Image was not as a gif type.");
6663  }
6664 }
6665 
6666 static void* stbi__gif_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
6667 {
6668  stbi_uc* u = 0;
6669  stbi__gif g;
6670  memset(&g, 0, sizeof(g));
6671 
6672  u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
6673  if (u == (stbi_uc*)s) u = 0; // end of animated gif marker
6674  if (u) {
6675  *x = g.w;
6676  *y = g.h;
6677 
6678  // moved conversion to after successful load so that the same
6679  // can be done for multiple frames.
6680  if (req_comp && req_comp != 4)
6681  u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
6682  }
6683 
6684  // free buffers needed for multiple frame loading;
6685  STBI_FREE(g.history);
6686  STBI_FREE(g.background);
6687 
6688  return u;
6689 }
6690 
6691 static int stbi__gif_info(stbi__context* s, int* x, int* y, int* comp)
6692 {
6693  return stbi__gif_info_raw(s, x, y, comp);
6694 }
6695 #endif
6696 
6697 // *************************************************************************************************
6698 // Radiance RGBE HDR loader
6699 // originally by Nicolas Schulz
6700 #ifndef STBI_NO_HDR
6701 static int stbi__hdr_test_core(stbi__context* s, const char* signature)
6702 {
6703  int i;
6704  for (i = 0; signature[i]; ++i)
6705  if (stbi__get8(s) != signature[i])
6706  return 0;
6707  stbi__rewind(s);
6708  return 1;
6709 }
6710 
6711 static int stbi__hdr_test(stbi__context* s)
6712 {
6713  int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
6714  stbi__rewind(s);
6715  if (!r) {
6716  r = stbi__hdr_test_core(s, "#?RGBE\n");
6717  stbi__rewind(s);
6718  }
6719  return r;
6720 }
6721 
6722 #define STBI__HDR_BUFLEN 1024
6723 static char* stbi__hdr_gettoken(stbi__context * z, char* buffer)
6724 {
6725  int len = 0;
6726  char c = '\0';
6727 
6728  c = (char)stbi__get8(z);
6729 
6730  while (!stbi__at_eof(z) && c != '\n') {
6731  buffer[len++] = c;
6732  if (len == STBI__HDR_BUFLEN - 1) {
6733  // flush to end of line
6734  while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
6735  ;
6736  break;
6737  }
6738  c = (char)stbi__get8(z);
6739  }
6740 
6741  buffer[len] = 0;
6742  return buffer;
6743 }
6744 
6745 static void stbi__hdr_convert(float* output, stbi_uc* input, int req_comp)
6746 {
6747  if (input[3] != 0) {
6748  float f1;
6749  // Exponent
6750  f1 = (float)ldexp(1.0f, input[3] - (int)(128 + 8));
6751  if (req_comp <= 2)
6752  output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
6753  else {
6754  output[0] = input[0] * f1;
6755  output[1] = input[1] * f1;
6756  output[2] = input[2] * f1;
6757  }
6758  if (req_comp == 2) output[1] = 1;
6759  if (req_comp == 4) output[3] = 1;
6760  }
6761  else {
6762  switch (req_comp) {
6763  case 4: output[3] = 1; /* fallthrough */
6764  case 3: output[0] = output[1] = output[2] = 0;
6765  break;
6766  case 2: output[1] = 1; /* fallthrough */
6767  case 1: output[0] = 0;
6768  break;
6769  }
6770  }
6771 }
6772 
6773 static float* stbi__hdr_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
6774 {
6775  char buffer[STBI__HDR_BUFLEN];
6776  char* token;
6777  int valid = 0;
6778  int width, height;
6779  stbi_uc* scanline;
6780  float* hdr_data;
6781  int len;
6782  unsigned char count, value;
6783  int i, j, k, c1, c2, z;
6784  const char* headerToken;
6785  STBI_NOTUSED(ri);
6786 
6787  // Check identifier
6788  headerToken = stbi__hdr_gettoken(s, buffer);
6789  if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
6790  return stbi__errpf("not HDR", "Corrupt HDR image");
6791 
6792  // Parse header
6793  for (;;) {
6794  token = stbi__hdr_gettoken(s, buffer);
6795  if (token[0] == 0) break;
6796  if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6797  }
6798 
6799  if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6800 
6801  // Parse width and height
6802  // can't use sscanf() if we're not using stdio!
6803  token = stbi__hdr_gettoken(s, buffer);
6804  if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6805  token += 3;
6806  height = (int)strtol(token, &token, 10);
6807  while (*token == ' ') ++token;
6808  if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6809  token += 3;
6810  width = (int)strtol(token, NULL, 10);
6811 
6812  *x = width;
6813  *y = height;
6814 
6815  if (comp)* comp = 3;
6816  if (req_comp == 0) req_comp = 3;
6817 
6818  if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
6819  return stbi__errpf("too large", "HDR image is too large");
6820 
6821  // Read data
6822  hdr_data = (float*)stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
6823  if (!hdr_data)
6824  return stbi__errpf("outofmem", "Out of memory");
6825 
6826  // Load image data
6827  // image data is stored as some number of sca
6828  if (width < 8 || width >= 32768) {
6829  // Read flat data
6830  for (j = 0; j < height; ++j) {
6831  for (i = 0; i < width; ++i) {
6832  stbi_uc rgbe[4];
6833  main_decode_loop:
6834  stbi__getn(s, rgbe, 4);
6835  stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6836  }
6837  }
6838  }
6839  else {
6840  // Read RLE-encoded data
6841  scanline = NULL;
6842 
6843  for (j = 0; j < height; ++j) {
6844  c1 = stbi__get8(s);
6845  c2 = stbi__get8(s);
6846  len = stbi__get8(s);
6847  if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6848  // not run-length encoded, so we have to actually use THIS data as a decoded
6849  // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6850  stbi_uc rgbe[4];
6851  rgbe[0] = (stbi_uc)c1;
6852  rgbe[1] = (stbi_uc)c2;
6853  rgbe[2] = (stbi_uc)len;
6854  rgbe[3] = (stbi_uc)stbi__get8(s);
6855  stbi__hdr_convert(hdr_data, rgbe, req_comp);
6856  i = 1;
6857  j = 0;
6858  STBI_FREE(scanline);
6859  goto main_decode_loop; // yes, this makes no sense
6860  }
6861  len <<= 8;
6862  len |= stbi__get8(s);
6863  if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
6864  if (scanline == NULL) {
6865  scanline = (stbi_uc*)stbi__malloc_mad2(width, 4, 0);
6866  if (!scanline) {
6867  STBI_FREE(hdr_data);
6868  return stbi__errpf("outofmem", "Out of memory");
6869  }
6870  }
6871 
6872  for (k = 0; k < 4; ++k) {
6873  int nleft;
6874  i = 0;
6875  while ((nleft = width - i) > 0) {
6876  count = stbi__get8(s);
6877  if (count > 128) {
6878  // Run
6879  value = stbi__get8(s);
6880  count -= 128;
6881  if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
6882  for (z = 0; z < count; ++z)
6883  scanline[i++ * 4 + k] = value;
6884  }
6885  else {
6886  // Dump
6887  if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
6888  for (z = 0; z < count; ++z)
6889  scanline[i++ * 4 + k] = stbi__get8(s);
6890  }
6891  }
6892  }
6893  for (i = 0; i < width; ++i)
6894  stbi__hdr_convert(hdr_data + (j * width + i) * req_comp, scanline + i * 4, req_comp);
6895  }
6896  if (scanline)
6897  STBI_FREE(scanline);
6898  }
6899 
6900  return hdr_data;
6901 }
6902 
6903 static int stbi__hdr_info(stbi__context* s, int* x, int* y, int* comp)
6904 {
6905  char buffer[STBI__HDR_BUFLEN];
6906  char* token;
6907  int valid = 0;
6908  int dummy;
6909 
6910  if (!x) x = &dummy;
6911  if (!y) y = &dummy;
6912  if (!comp) comp = &dummy;
6913 
6914  if (stbi__hdr_test(s) == 0) {
6915  stbi__rewind(s);
6916  return 0;
6917  }
6918 
6919  for (;;) {
6920  token = stbi__hdr_gettoken(s, buffer);
6921  if (token[0] == 0) break;
6922  if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6923  }
6924 
6925  if (!valid) {
6926  stbi__rewind(s);
6927  return 0;
6928  }
6929  token = stbi__hdr_gettoken(s, buffer);
6930  if (strncmp(token, "-Y ", 3)) {
6931  stbi__rewind(s);
6932  return 0;
6933  }
6934  token += 3;
6935  *y = (int)strtol(token, &token, 10);
6936  while (*token == ' ') ++token;
6937  if (strncmp(token, "+X ", 3)) {
6938  stbi__rewind(s);
6939  return 0;
6940  }
6941  token += 3;
6942  *x = (int)strtol(token, NULL, 10);
6943  *comp = 3;
6944  return 1;
6945 }
6946 #endif // STBI_NO_HDR
6947 
6948 #ifndef STBI_NO_BMP
6949 static int stbi__bmp_info(stbi__context* s, int* x, int* y, int* comp)
6950 {
6951  void* p;
6952  stbi__bmp_data info;
6953 
6954  info.all_a = 255;
6955  p = stbi__bmp_parse_header(s, &info);
6956  stbi__rewind(s);
6957  if (p == NULL)
6958  return 0;
6959  if (x)* x = s->img_x;
6960  if (y)* y = s->img_y;
6961  if (comp)* comp = info.ma ? 4 : 3;
6962  return 1;
6963 }
6964 #endif
6965 
6966 #ifndef STBI_NO_PSD
6967 static int stbi__psd_info(stbi__context* s, int* x, int* y, int* comp)
6968 {
6969  int channelCount, dummy, depth;
6970  if (!x) x = &dummy;
6971  if (!y) y = &dummy;
6972  if (!comp) comp = &dummy;
6973  if (stbi__get32be(s) != 0x38425053) {
6974  stbi__rewind(s);
6975  return 0;
6976  }
6977  if (stbi__get16be(s) != 1) {
6978  stbi__rewind(s);
6979  return 0;
6980  }
6981  stbi__skip(s, 6);
6982  channelCount = stbi__get16be(s);
6983  if (channelCount < 0 || channelCount > 16) {
6984  stbi__rewind(s);
6985  return 0;
6986  }
6987  *y = stbi__get32be(s);
6988  *x = stbi__get32be(s);
6989  depth = stbi__get16be(s);
6990  if (depth != 8 && depth != 16) {
6991  stbi__rewind(s);
6992  return 0;
6993  }
6994  if (stbi__get16be(s) != 3) {
6995  stbi__rewind(s);
6996  return 0;
6997  }
6998  *comp = 4;
6999  return 1;
7000 }
7001 
7002 static int stbi__psd_is16(stbi__context* s)
7003 {
7004  int channelCount, depth;
7005  if (stbi__get32be(s) != 0x38425053) {
7006  stbi__rewind(s);
7007  return 0;
7008  }
7009  if (stbi__get16be(s) != 1) {
7010  stbi__rewind(s);
7011  return 0;
7012  }
7013  stbi__skip(s, 6);
7014  channelCount = stbi__get16be(s);
7015  if (channelCount < 0 || channelCount > 16) {
7016  stbi__rewind(s);
7017  return 0;
7018  }
7019  (void)stbi__get32be(s);
7020  (void)stbi__get32be(s);
7021  depth = stbi__get16be(s);
7022  if (depth != 16) {
7023  stbi__rewind(s);
7024  return 0;
7025  }
7026  return 1;
7027 }
7028 #endif
7029 
7030 #ifndef STBI_NO_PIC
7031 static int stbi__pic_info(stbi__context* s, int* x, int* y, int* comp)
7032 {
7033  int act_comp = 0, num_packets = 0, chained, dummy;
7034  stbi__pic_packet packets[10];
7035 
7036  if (!x) x = &dummy;
7037  if (!y) y = &dummy;
7038  if (!comp) comp = &dummy;
7039 
7040  if (!stbi__pic_is4(s, "\x53\x80\xF6\x34")) {
7041  stbi__rewind(s);
7042  return 0;
7043  }
7044 
7045  stbi__skip(s, 88);
7046 
7047  *x = stbi__get16be(s);
7048  *y = stbi__get16be(s);
7049  if (stbi__at_eof(s)) {
7050  stbi__rewind(s);
7051  return 0;
7052  }
7053  if ((*x) != 0 && (1 << 28) / (*x) < (*y)) {
7054  stbi__rewind(s);
7055  return 0;
7056  }
7057 
7058  stbi__skip(s, 8);
7059 
7060  do {
7061  stbi__pic_packet* packet;
7062 
7063  if (num_packets == sizeof(packets) / sizeof(packets[0]))
7064  return 0;
7065 
7066  packet = &packets[num_packets++];
7067  chained = stbi__get8(s);
7068  packet->size = stbi__get8(s);
7069  packet->type = stbi__get8(s);
7070  packet->channel = stbi__get8(s);
7071  act_comp |= packet->channel;
7072 
7073  if (stbi__at_eof(s)) {
7074  stbi__rewind(s);
7075  return 0;
7076  }
7077  if (packet->size != 8) {
7078  stbi__rewind(s);
7079  return 0;
7080  }
7081  } while (chained);
7082 
7083  *comp = (act_comp & 0x10 ? 4 : 3);
7084 
7085  return 1;
7086 }
7087 #endif
7088 
7089 // *************************************************************************************************
7090 // Portable Gray Map and Portable Pixel Map loader
7091 // by Ken Miller
7092 //
7093 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
7094 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
7095 //
7096 // Known limitations:
7097 // Does not support comments in the header section
7098 // Does not support ASCII image data (formats P2 and P3)
7099 // Does not support 16-bit-per-channel
7100 
7101 #ifndef STBI_NO_PNM
7102 
7103 static int stbi__pnm_test(stbi__context* s)
7104 {
7105  char p, t;
7106  p = (char)stbi__get8(s);
7107  t = (char)stbi__get8(s);
7108  if (p != 'P' || (t != '5' && t != '6')) {
7109  stbi__rewind(s);
7110  return 0;
7111  }
7112  return 1;
7113 }
7114 
7115 static void* stbi__pnm_load(stbi__context* s, int* x, int* y, int* comp, int req_comp, stbi__result_info* ri)
7116 {
7117  stbi_uc* out;
7118  STBI_NOTUSED(ri);
7119 
7120  if (!stbi__pnm_info(s, (int*)& s->img_x, (int*)& s->img_y, (int*)& s->img_n))
7121  return 0;
7122 
7123  *x = s->img_x;
7124  *y = s->img_y;
7125  if (comp)* comp = s->img_n;
7126 
7127  if (!stbi__mad3sizes_valid(s->img_n, s->img_x, s->img_y, 0))
7128  return stbi__errpuc("too large", "PNM too large");
7129 
7130  out = (stbi_uc*)stbi__malloc_mad3(s->img_n, s->img_x, s->img_y, 0);
7131  if (!out) return stbi__errpuc("outofmem", "Out of memory");
7132  stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
7133 
7134  if (req_comp && req_comp != s->img_n) {
7135  out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
7136  if (out == NULL) return out; // stbi__convert_format frees input on failure
7137  }
7138  return out;
7139 }
7140 
7141 static int stbi__pnm_isspace(char c)
7142 {
7143  return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
7144 }
7145 
7146 static void stbi__pnm_skip_whitespace(stbi__context* s, char* c)
7147 {
7148  for (;;) {
7149  while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
7150  * c = (char)stbi__get8(s);
7151 
7152  if (stbi__at_eof(s) || *c != '#')
7153  break;
7154 
7155  while (!stbi__at_eof(s) && *c != '\n' && *c != '\r')
7156  * c = (char)stbi__get8(s);
7157  }
7158 }
7159 
7160 static int stbi__pnm_isdigit(char c)
7161 {
7162  return c >= '0' && c <= '9';
7163 }
7164 
7165 static int stbi__pnm_getinteger(stbi__context* s, char* c)
7166 {
7167  int value = 0;
7168 
7169  while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
7170  value = value * 10 + (*c - '0');
7171  *c = (char)stbi__get8(s);
7172  }
7173 
7174  return value;
7175 }
7176 
7177 static int stbi__pnm_info(stbi__context* s, int* x, int* y, int* comp)
7178 {
7179  int maxv, dummy;
7180  char c, p, t;
7181 
7182  if (!x) x = &dummy;
7183  if (!y) y = &dummy;
7184  if (!comp) comp = &dummy;
7185 
7186  stbi__rewind(s);
7187 
7188  // Get identifier
7189  p = (char)stbi__get8(s);
7190  t = (char)stbi__get8(s);
7191  if (p != 'P' || (t != '5' && t != '6')) {
7192  stbi__rewind(s);
7193  return 0;
7194  }
7195 
7196  *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
7197 
7198  c = (char)stbi__get8(s);
7199  stbi__pnm_skip_whitespace(s, &c);
7200 
7201  *x = stbi__pnm_getinteger(s, &c); // read width
7202  stbi__pnm_skip_whitespace(s, &c);
7203 
7204  *y = stbi__pnm_getinteger(s, &c); // read height
7205  stbi__pnm_skip_whitespace(s, &c);
7206 
7207  maxv = stbi__pnm_getinteger(s, &c); // read max value
7208 
7209  if (maxv > 255)
7210  return stbi__err("max value > 255", "PPM image not 8-bit");
7211  else
7212  return 1;
7213 }
7214 #endif
7215 
7216 static int stbi__info_main(stbi__context* s, int* x, int* y, int* comp)
7217 {
7218 #ifndef STBI_NO_JPEG
7219  if (stbi__jpeg_info(s, x, y, comp)) return 1;
7220 #endif
7221 
7222 #ifndef STBI_NO_PNG
7223  if (stbi__png_info(s, x, y, comp)) return 1;
7224 #endif
7225 
7226 #ifndef STBI_NO_GIF
7227  if (stbi__gif_info(s, x, y, comp)) return 1;
7228 #endif
7229 
7230 #ifndef STBI_NO_BMP
7231  if (stbi__bmp_info(s, x, y, comp)) return 1;
7232 #endif
7233 
7234 #ifndef STBI_NO_PSD
7235  if (stbi__psd_info(s, x, y, comp)) return 1;
7236 #endif
7237 
7238 #ifndef STBI_NO_PIC
7239  if (stbi__pic_info(s, x, y, comp)) return 1;
7240 #endif
7241 
7242 #ifndef STBI_NO_PNM
7243  if (stbi__pnm_info(s, x, y, comp)) return 1;
7244 #endif
7245 
7246 #ifndef STBI_NO_HDR
7247  if (stbi__hdr_info(s, x, y, comp)) return 1;
7248 #endif
7249 
7250  // test tga last because it's a crappy test!
7251 #ifndef STBI_NO_TGA
7252  if (stbi__tga_info(s, x, y, comp))
7253  return 1;
7254 #endif
7255  return stbi__err("unknown image type", "Image not of any known type, or corrupt");
7256 }
7257 
7258 static int stbi__is_16_main(stbi__context* s)
7259 {
7260 #ifndef STBI_NO_PNG
7261  if (stbi__png_is16(s)) return 1;
7262 #endif
7263 
7264 #ifndef STBI_NO_PSD
7265  if (stbi__psd_is16(s)) return 1;
7266 #endif
7267 
7268  return 0;
7269 }
7270 
7271 #ifndef STBI_NO_STDIO
7272 STBIDEF int stbi_info(char const* filename, int* x, int* y, int* comp)
7273 {
7274  FILE* f = stbi__fopen(filename, "rb");
7275  int result;
7276  if (!f) return stbi__err("can't fopen", "Unable to open file");
7277  result = stbi_info_from_file(f, x, y, comp);
7278  fclose(f);
7279  return result;
7280 }
7281 
7282 STBIDEF int stbi_info_from_file(FILE* f, int* x, int* y, int* comp)
7283 {
7284  int r;
7285  stbi__context s;
7286  long pos = ftell(f);
7287  stbi__start_file(&s, f);
7288  r = stbi__info_main(&s, x, y, comp);
7289  fseek(f, pos, SEEK_SET);
7290  return r;
7291 }
7292 
7293 STBIDEF int stbi_is_16_bit(char const* filename)
7294 {
7295  FILE* f = stbi__fopen(filename, "rb");
7296  int result;
7297  if (!f) return stbi__err("can't fopen", "Unable to open file");
7298  result = stbi_is_16_bit_from_file(f);
7299  fclose(f);
7300  return result;
7301 }
7302 
7303 STBIDEF int stbi_is_16_bit_from_file(FILE* f)
7304 {
7305  int r;
7306  stbi__context s;
7307  long pos = ftell(f);
7308  stbi__start_file(&s, f);
7309  r = stbi__is_16_main(&s);
7310  fseek(f, pos, SEEK_SET);
7311  return r;
7312 }
7313 #endif // !STBI_NO_STDIO
7314 
7315 STBIDEF int stbi_info_from_memory(stbi_uc const* buffer, int len, int* x, int* y, int* comp)
7316 {
7317  stbi__context s;
7318  stbi__start_mem(&s, buffer, len);
7319  return stbi__info_main(&s, x, y, comp);
7320 }
7321 
7322 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const* c, void* user, int* x, int* y, int* comp)
7323 {
7324  stbi__context s;
7325  stbi__start_callbacks(&s, (stbi_io_callbacks*)c, user);
7326  return stbi__info_main(&s, x, y, comp);
7327 }
7328 
7329 STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const* buffer, int len)
7330 {
7331  stbi__context s;
7332  stbi__start_mem(&s, buffer, len);
7333  return stbi__is_16_main(&s);
7334 }
7335 
7336 STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const* c, void* user)
7337 {
7338  stbi__context s;
7339  stbi__start_callbacks(&s, (stbi_io_callbacks*)c, user);
7340  return stbi__is_16_main(&s);
7341 }
7342 
7343 #endif // STB_IMAGE_IMPLEMENTATION
7344 
7345 /*
7346 revision history:
7347 2.19 (2018-02-11) fix warning
7348 2.18 (2018-01-30) fix warnings
7349 2.17 (2018-01-29) change sbti__shiftsigned to avoid clang -O2 bug
7350 1-bit BMP
7351 *_is_16_bit api
7352 avoid warnings
7353 2.16 (2017-07-23) all functions have 16-bit variants;
7354 STBI_NO_STDIO works again;
7355 compilation fixes;
7356 fix rounding in unpremultiply;
7357 optimize vertical flip;
7358 disable raw_len validation;
7359 documentation fixes
7360 2.15 (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
7361 warning fixes; disable run-time SSE detection on gcc;
7362 uniform handling of optional "return" values;
7363 thread-safe initialization of zlib tables
7364 2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
7365 2.13 (2016-11-29) add 16-bit API, only supported for PNG right now
7366 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
7367 2.11 (2016-04-02) allocate large structures on the stack
7368 remove white matting for transparent PSD
7369 fix reported channel count for PNG & BMP
7370 re-enable SSE2 in non-gcc 64-bit
7371 support RGB-formatted JPEG
7372 read 16-bit PNGs (only as 8-bit)
7373 2.10 (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
7374 2.09 (2016-01-16) allow comments in PNM files
7375 16-bit-per-pixel TGA (not bit-per-component)
7376 info() for TGA could break due to .hdr handling
7377 info() for BMP to shares code instead of sloppy parse
7378 can use STBI_REALLOC_SIZED if allocator doesn't support realloc
7379 code cleanup
7380 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
7381 2.07 (2015-09-13) fix compiler warnings
7382 partial animated GIF support
7383 limited 16-bpc PSD support
7384 #ifdef unused functions
7385 bug with < 92 byte PIC,PNM,HDR,TGA
7386 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
7387 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
7388 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
7389 2.03 (2015-04-12) extra corruption checking (mmozeiko)
7390 stbi_set_flip_vertically_on_load (nguillemot)
7391 fix NEON support; fix mingw support
7392 2.02 (2015-01-19) fix incorrect assert, fix warning
7393 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
7394 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
7395 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
7396 progressive JPEG (stb)
7397 PGM/PPM support (Ken Miller)
7398 STBI_MALLOC,STBI_REALLOC,STBI_FREE
7399 GIF bugfix -- seemingly never worked
7400 STBI_NO_*, STBI_ONLY_*
7401 1.48 (2014-12-14) fix incorrectly-named assert()
7402 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
7403 optimize PNG (ryg)
7404 fix bug in interlaced PNG with user-specified channel count (stb)
7405 1.46 (2014-08-26)
7406 fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
7407 1.45 (2014-08-16)
7408 fix MSVC-ARM internal compiler error by wrapping malloc
7409 1.44 (2014-08-07)
7410 various warning fixes from Ronny Chevalier
7411 1.43 (2014-07-15)
7412 fix MSVC-only compiler problem in code changed in 1.42
7413 1.42 (2014-07-09)
7414 don't define _CRT_SECURE_NO_WARNINGS (affects user code)
7415 fixes to stbi__cleanup_jpeg path
7416 added STBI_ASSERT to avoid requiring assert.h
7417 1.41 (2014-06-25)
7418 fix search&replace from 1.36 that messed up comments/error messages
7419 1.40 (2014-06-22)
7420 fix gcc struct-initialization warning
7421 1.39 (2014-06-15)
7422 fix to TGA optimization when req_comp != number of components in TGA;
7423 fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
7424 add support for BMP version 5 (more ignored fields)
7425 1.38 (2014-06-06)
7426 suppress MSVC warnings on integer casts truncating values
7427 fix accidental rename of 'skip' field of I/O
7428 1.37 (2014-06-04)
7429 remove duplicate typedef
7430 1.36 (2014-06-03)
7431 convert to header file single-file library
7432 if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
7433 1.35 (2014-05-27)
7434 various warnings
7435 fix broken STBI_SIMD path
7436 fix bug where stbi_load_from_file no longer left file pointer in correct place
7437 fix broken non-easy path for 32-bit BMP (possibly never used)
7438 TGA optimization by Arseny Kapoulkine
7439 1.34 (unknown)
7440 use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
7441 1.33 (2011-07-14)
7442 make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
7443 1.32 (2011-07-13)
7444 support for "info" function for all supported filetypes (SpartanJ)
7445 1.31 (2011-06-20)
7446 a few more leak fixes, bug in PNG handling (SpartanJ)
7447 1.30 (2011-06-11)
7448 added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
7449 removed deprecated format-specific test/load functions
7450 removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
7451 error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
7452 fix inefficiency in decoding 32-bit BMP (David Woo)
7453 1.29 (2010-08-16)
7454 various warning fixes from Aurelien Pocheville
7455 1.28 (2010-08-01)
7456 fix bug in GIF palette transparency (SpartanJ)
7457 1.27 (2010-08-01)
7458 cast-to-stbi_uc to fix warnings
7459 1.26 (2010-07-24)
7460 fix bug in file buffering for PNG reported by SpartanJ
7461 1.25 (2010-07-17)
7462 refix trans_data warning (Won Chun)
7463 1.24 (2010-07-12)
7464 perf improvements reading from files on platforms with lock-heavy fgetc()
7465 minor perf improvements for jpeg
7466 deprecated type-specific functions so we'll get feedback if they're needed
7467 attempt to fix trans_data warning (Won Chun)
7468 1.23 fixed bug in iPhone support
7469 1.22 (2010-07-10)
7470 removed image *writing* support
7471 stbi_info support from Jetro Lauha
7472 GIF support from Jean-Marc Lienher
7473 iPhone PNG-extensions from James Brown
7474 warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
7475 1.21 fix use of 'stbi_uc' in header (reported by jon blow)
7476 1.20 added support for Softimage PIC, by Tom Seddon
7477 1.19 bug in interlaced PNG corruption check (found by ryg)
7478 1.18 (2008-08-02)
7479 fix a threading bug (local mutable static)
7480 1.17 support interlaced PNG
7481 1.16 major bugfix - stbi__convert_format converted one too many pixels
7482 1.15 initialize some fields for thread safety
7483 1.14 fix threadsafe conversion bug
7484 header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
7485 1.13 threadsafe
7486 1.12 const qualifiers in the API
7487 1.11 Support installable IDCT, colorspace conversion routines
7488 1.10 Fixes for 64-bit (don't use "unsigned long")
7489 optimized upsampling by Fabian "ryg" Giesen
7490 1.09 Fix format-conversion for PSD code (bad global variables!)
7491 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
7492 1.07 attempt to fix C++ warning/errors again
7493 1.06 attempt to fix C++ warning/errors again
7494 1.05 fix TGA loading to return correct *comp and use good luminance calc
7495 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
7496 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
7497 1.02 support for (subset of) HDR files, float interface for preferred access to them
7498 1.01 fix bug: possible bug in handling right-side up bmps... not sure
7499 fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
7500 1.00 interface to zlib that skips zlib header
7501 0.99 correct handling of alpha in palette
7502 0.98 TGA loader by lonesock; dynamically add loaders (untested)
7503 0.97 jpeg errors on too large a file; also catch another malloc failure
7504 0.96 fix detection of invalid v value - particleman@mollyrocket forum
7505 0.95 during header scan, seek to markers in case of padding
7506 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
7507 0.93 handle jpegtran output; verbose errors
7508 0.92 read 4,8,16,24,32-bit BMP files of several formats
7509 0.91 output 24-bit Windows 3.0 BMP files
7510 0.90 fix a few more warnings; bump version number to approach 1.0
7511 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
7512 0.60 fix compiling as c++
7513 0.59 fix warnings: merge Dave Moore's -Wall fixes
7514 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
7515 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
7516 0.56 fix bug: zlib uncompressed mode len vs. nlen
7517 0.55 fix bug: restart_interval not initialized to 0
7518 0.54 allow NULL for 'int *comp'
7519 0.53 fix bug in png 3->4; speedup png decoding
7520 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
7521 0.51 obey req_comp requests, 1-component jpegs return as 1-component,
7522 on 'test' only check type, not whether we support this variant
7523 0.50 (2006-11-19)
7524 first released version
7525 */
7526 
7527 
7528 /*
7529 ------------------------------------------------------------------------------
7530 This software is available under 2 licenses -- choose whichever you prefer.
7531 ------------------------------------------------------------------------------
7532 ALTERNATIVE A - MIT License
7533 Copyright (c) 2017 Sean Barrett
7534 Permission is hereby granted, free of charge, to any person obtaining a copy of
7535 this software and associated documentation files (the "Software"), to deal in
7536 the Software without restriction, including without limitation the rights to
7537 use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
7538 of the Software, and to permit persons to whom the Software is furnished to do
7539 so, subject to the following conditions:
7540 The above copyright notice and this permission notice shall be included in all
7541 copies or substantial portions of the Software.
7542 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
7543 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
7544 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
7545 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
7546 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
7547 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
7548 SOFTWARE.
7549 ------------------------------------------------------------------------------
7550 ALTERNATIVE B - Public Domain (www.unlicense.org)
7551 This is free and unencumbered software released into the public domain.
7552 Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
7553 software, either in source code form or as a compiled binary, for any purpose,
7554 commercial or non-commercial, and by any means.
7555 In jurisdictions that recognize copyright laws, the author or authors of this
7556 software dedicate any and all copyright interest in the software to the public
7557 domain. We make this dedication for the benefit of the public at large and to
7558 the detriment of our heirs and successors. We intend this dedication to be an
7559 overt act of relinquishment in perpetuity of all present and future rights to
7560 this software under copyright law.
7561 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
7562 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
7563 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
7564 AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
7565 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
7566 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
7567 ------------------------------------------------------------------------------
7568 */
Definition: stb_image.h:345