Production-focused notes from Android performance work on real apps and SDKs between 2023 and 2025. This is not a generic checklist. It is a compact record of the problems I have debugged, the changes I made, and the results those changes produced in startup, ANR, memory, networking, and SDK-heavy environments.
- Reduced cold start latency by restructuring initialization and moving non-critical SDK work off the main thread.
- Lowered ANR risk by removing main-thread I/O, isolating timing work, and reducing startup queue pressure on the main
Looper. - Improved memory stability by switching large-response handling from full buffering to streamed parsing.
- Added memory guardrails that stop non-essential work before heap pressure turns into OOM.
- Reduced APK size by removing unnecessary serialization dependencies and converting assets to lighter formats.
- Improved network efficiency by warming connections early and skipping unneeded ad resources.
- Built thread-pool isolation for config, ads, I/O, and scheduled work so unrelated tasks stop blocking each other.
- Android engineers working on production apps, not toy projects.
- Engineers debugging ANRs, cold start regressions, crash spikes, or OOM issues.
- Mobile engineers maintaining apps with heavy SDK integration.
- Engineers building SDKs or shared mobile infrastructure used across multiple apps or teams.
Problem
- Cold start was unstable.
- Startup ANRs were tied to storage reads and third-party SDK init.
What caused it
SharedPreferencesand SDK initialization were both sitting on the critical startup path.- Several small startup tasks were competing on the main thread, which increased message queue pressure and delayed first-frame work.
What I changed
- Migrated hot config reads from
SharedPreferencesto MMKV. - Split cold-start reads into stages instead of loading everything up front.
- Moved OAID/GID, analytics, browser components, and ad SDK startup off the main thread where integration rules allowed it.
- Used
MessageQueue.IdleHandlerfor work that could wait until the first burst of startup work finished. - Moved splash countdown handling to a
HandlerThreadso timing work stopped competing with rendering.
Result
- Configuration loading on the critical path improved by about 60x.
- Startup became more predictable because blocking work and deferred work were separated clearly.
- Main-thread stalls dropped because cold start no longer mixed storage, SDK init, and timing work in one lane.
Problem
- Large responses created avoidable memory pressure and increased OOM risk on low-memory devices.
What caused it
- Response parsing used
bytes(), which loaded the full payload into memory before parsing started. - Work kept running even when heap usage was already close to the limit.
What I changed
- Replaced
bytes()with streamed parsing fromsource(). - Added an OOM circuit breaker to stop non-essential work when heap usage exceeded 85% or system free memory dropped below 50 MB.
- Treated memory protection as part of task scheduling, not only as a parsing fix.
Result
- Peak allocation pressure dropped in large-response flows.
- Memory handling became more defensive under stress instead of failing late.
- The risky paths became more stable on low-end and low-memory devices.
- Cold start had too much work on the critical path.
- ANRs were showing up during startup and app backgrounding.
- Third-party SDK integration added I/O and initialization cost at the worst possible time.
SharedPreferencesreads were blocking the startup path.- OAID/GID, analytics, browser components, and ad startup all competed on the main thread.
- Repeated framework queries and splash-related timing logic added avoidable work to the main
Looper.
- Replaced hot-path
SharedPreferencesaccess with MMKV to reduce storage overhead during startup. - Split startup config reads into phases so only required data loaded before first screen.
- Merged config sources to reduce read amplification during cold start.
- Moved OAID/GID, analytics, and browser initialization onto background threads.
- Shifted ad SDK
initandstartwork off the main thread where possible. - Deferred non-critical startup work with
MessageQueue.IdleHandler. - Cached ad views before display to reduce UI-thread work during splash rendering.
- Cached
DisplayMetricsto avoid repeated framework lookups. - Moved splash countdown scheduling to a dedicated
HandlerThread. - Let the host app perform main-process checks to avoid repeating that work inside the SDK.
- Reduced startup blocking by cutting I/O and SDK work from the first-frame path.
- Lowered ANR risk by reducing contention on the main thread.
- Made startup behavior easier to debug because initialization now had clear phases and ownership.
- Large payloads created temporary objects that pushed the app toward OOM.
- Some crash paths were tied to memory pressure and oversized Binder transactions.
- Destroyed screens could retain more view state than they should.
Retrofitparsing usedbytes(), which forced full in-memory buffering.- Background work kept running even when the process was already under memory pressure.
- WebView-related flows could hit
TransactionTooLargeException. - View hierarchies were not always cleared aggressively enough on teardown.
- Switched from
bytes()tosource()so parsing could stream instead of buffering the whole response. - Added an OOM circuit breaker that stops non-essential tasks above 85% heap usage or below 50 MB of free system memory.
- Used a static-holder workaround in the WebView path to avoid Binder-heavy state transfer.
- Recursively removed view references up to three levels deep during
ActivityandFragmentdestruction.
- Reduced large-object allocation in response parsing.
- Lowered the chance that memory spikes would turn into process death.
- Improved stability in heavy-page and WebView-related flows.
- Reduced leak-prone retained references during lifecycle teardown.
- The app carried dependency and asset weight it did not need.
- Network flows downloaded resources that were not always used.
- Connection setup cost showed up too late in user-critical flows.
- Serialization dependencies pulled in extra code and transitive weight.
- PNG assets were heavier than necessary.
- Resource download logic was not strict enough for media-specific ad flows.
- DNS, TCP, and TLS setup happened only when the real request started.
- Replaced the Avro-based path with Gson where the simpler stack was sufficient.
- Removed Jackson from the dependency graph to cut package size.
- Converted suitable PNG assets to WebP.
- Added preconnect with HTTP
HEADto establish connections earlier. - Skipped image downloads for video ads when the flow only required video resources.
- Reduced APK size by 25% after removing Jackson-related overhead.
- Reduced unnecessary bandwidth use by tightening resource download rules.
- Improved request readiness by warming network connections before the critical request.
- One shared executor made unrelated tasks block each other.
- Handler-heavy scheduling made task flow harder to reason about.
- Time-sensitive work and blocking I/O were mixed together.
- Config, ads, I/O, and scheduled jobs all shared the same thread-pool lane.
- Queue contention increased latency even when individual tasks were small.
- Some non-UI work still depended on
Handlerdispatch instead of simpler executor-based scheduling.
- Split the single thread pool into isolated executors for config, ads, I/O, and scheduled work.
- Removed extra scheduling indirection where
execute()calls were stacked without real value. - Replaced selected
Handlerusage with thread-pool execution for non-UI tasks. - Reused message instances with
Message.obtain()in hot paths.
- Reduced cross-task interference between I/O, config, and ad work.
- Made execution order more predictable and easier to debug.
- Kept UI work on the main
Looperand moved blocking work into dedicated background lanes.
- Several hot paths were doing more CPU and allocation work than necessary.
- Reflection and regex added overhead in repeated operations.
- Database and cache behavior did not match real production access patterns.
- Gson-based parsing depended on reflection in stable, high-frequency paths.
- Regex was used for checks that only needed simple ASCII validation.
- Database writes were too fragmented.
- Cache policy was not layered enough for local and server-side behavior.
- Replaced Gson with hand-written JSON parsing in the hottest controlled-schema paths.
- Replaced simple regex checks with ASCII-based validation.
- Switched GreenDao operations to batch writes where possible.
- Built a multi-level cache with local memory, LRU eviction, and server-side policy coordination.
- Reduced CPU overhead in parsing and validation hot paths.
- Reduced allocation churn caused by reflection-heavy parsing.
- Improved database efficiency under bursty write patterns.
- Improved cache hit behavior by aligning local and server-side strategy.
This work also reflects production SDK design, not only app-side optimization.
- Built around a coroutine-first model so network and storage work stay off the main thread by default.
- Used structured error handling so callers receive typed failures instead of vague strings.
- Kept the architecture extensible so transport, auth, and retry behavior can evolve without rewriting the API surface.
- Designed integrations for production usage, where cancellation, host-app constraints, and backward compatibility matter as much as clean syntax.
- Reduced host-app overhead by avoiding repeated process checks and unnecessary framework lookups inside SDK code.
- Treated startup cost as an SDK design problem, not only an app integration problem.
- Remove work from the main thread first, then decide what truly belongs in startup.
- Prefer streaming over full buffering when payload size is unpredictable.
- Design for low-end devices and memory pressure, not only for modern flagship phones.
- Isolate unrelated workloads so one slow lane does not stall the rest of the app.
- Keep performance fixes measurable, narrow, and maintainable.
- Startup optimization
- ANR reduction
- Main-thread debugging
Looperand message queue behavior- Memory management
- OOM prevention
- Binder transaction limits
- WebView stability
- APK size reduction
- Network warm-up
- Resource download control
- Thread-pool design
- Data structure optimization
- Cache strategy
- SDK design
- Crash debugging