The memory wall has been a big problem since '96 [1] or even earlier. Some form of 2.5D integration is difficult, but it doesn't seem as difficult as keeping up with Moore's law, and it would probably have required much lower capex.
Why is the industry only getting to it now, when Moore's law is ending?
[1]http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1563038&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1563038
This response is from Herb Reiter, an authority on the topic:
Interesting and very good questions. Here are some general and specific reasons that I think delayed it:
HISTORY:
- Multi-chip modules are in principle just like interposer-based 2.5D designs, BUT way too expensive for broad deployment ==> the unit cost of mounting dies side by side was too high.
- Military applications and space projects have been using memory cubes for a number of years already ==> assembly-yield losses and unit cost were very high, which limited broad deployment.
- Graphics and video processing were, until a few years ago, considered very high end, and the component cost and cooling requirements of GDDR memories were tolerable for these low-volume applications.
- EDA, equipment and materials vendors had no motivation to support these small markets and make the unit cost go down, the product quality go up, the design process more user-friendly,...
MARKET/APPLICATIONS TRENDS:
- The markets for high-performance graphics, networking and computing applications have been growing rapidly in the last few years, as video became widely popular (YouTube,...) and high-performance graphics became a must-have for video games, computer displays, 4k TVs, Tesla's display in the Model S and X,...
- Video and graphics, and the program store and data buffers needed for processing these data streams, require a lot of memory; much more than can be embedded in an SoC. The many DRAMs and Flash ICs needed take way too much PCB space and power.
- In addition, portable applications grew very fast and demand the lowest possible power dissipation and long battery life for success. Die-level IP on interposers enables a 3x to 10x power reduction versus individual ICs on a PCB.
TECHNOLOGY LEADERS SETTING EXAMPLES:
- Xilinx's 4 FPGA slices in ONE IC package demonstrated in 2011 how much power interposers can help save.
- Micron and Intel promoted the HMC memory cubes, starting in 2012, and also demonstrated the value of multi-die integration.
- Samsung's monolithic 3D V-NAND Flash and the HBM from Hynix and Samsung set a few more examples for higher levels of integration.
THE IoT BUZZWORD ADDS IMPORTANCE:
- Last, but not least, many of the emerging and very promising IoT applications need the integration of heterogeneous functions - in a small space and at low power. Intel's Curie module is NOT going to cut it for these opportunities. 2.5/3D-ICs will win.
SoC TECHNOLOGY DOESN'T REALLY ALLOW INTEGRATING A SYSTEM ON A CHIP:
- As you mentioned, the SoC NREs are going through the roof and design times for these - primarily digital logic and RAM ICs - are measured in years.
- The modularity, flexibility and IP-reuse friendliness of interposer designs will win and allow the integration of Systems.
IMPORTANT FACT to keep in mind:  
- Multi-die IC configurations will NOT REPLACE SoCs. Every 2.5/3D-IC will have an SoC - in die form - as its core. Die-level IP blocks, such as memory cubes, radios, A/D converters, sensors,... will COMPLEMENT the SoC in the core and make the 2.5/3D-IC "environmentally aware" and the entire 2.5/3D-IC much more versatile, useful and valuable, enabling true "System Scaling" and increasing semiconductor vendors' margins.
These are just a few thoughts on this topic. The new edition of the Multi-die IC Design Guide (2016.6, available at DAC) will address these topics in much greater depth AND include 300++ pages of info about what ~35 companies can do for Multi-die Integration.
Herb Reiter, President  eda 2 asic Consulting, Inc. since 2002 - Cell Phone 1 408 981 5831 
http://www.linkedin.com/pub/0/378/60  -  www.eda2asic.com - Office Phone 650 960 8578
Reducing Silicon - and Package design times, IC power dissipation and SYSTEM cost 
Recent IC and Packaging-related blogs: http://www.3dincites.com/author/herb-reiterieee-org/
No-charge download of the 2016.1 Multi-die IC Design Guide (300+ pages) at: http://esd-alliance.org/industry/publications